Abstract

We use quantum walks to construct a new quantum algorithm for element distinctness and its
generalization. For element distinctness (the problem of finding two equal items among $N$ given items), we
get an $O(N^{2/3})$ query quantum algorithm. This improves the previous $O(N^{3/4})$ quantum algorithm of
Buhrman et al. [14] and matches the lower bound by [1]. We also give an $O(N^{k/(k+1)})$ query quantum
algorithm for the generalization of element distinctness in which we have to find $k$ equal items among
$N$ items.

1 Introduction 

Element distinctness is the following problem. 

Element Distinctness. Given numbers $x_1, \ldots, x_N \in [M]$, are they all distinct?

It has been extensively studied both in classical and quantum computing. Classically, the best way to
solve element distinctness is by sorting, which requires $\Omega(N)$ queries. In the quantum setting, Buhrman et al.
[14] have constructed a quantum algorithm that uses $O(N^{3/4})$ queries. Aaronson and Shi [1] have shown
that any quantum algorithm requires at least $\Omega(N^{2/3})$ quantum queries.

In this paper, we give a new quantum algorithm that solves element distinctness with $O(N^{2/3})$ queries
to $x_1, \ldots, x_N$. This matches the lower bound of [1].

Our algorithm uses a combination of several ideas: quantum search on graphs [2] and quantum walks
[30]. While each of those ideas has been used before, the present combination is new.

We first reduce element distinctness to searching a certain graph whose vertices are sets
$S \subseteq \{1, \ldots, N\}$. The goal of the search is to find a marked vertex. Both examining the current vertex and moving
to a neighboring vertex cost one time step. (This contrasts with the usual quantum search [26], where only
examining the current vertex costs one time step.)

We then search this graph by quantum random walk. We start in a uniform superposition over all vertices
of the graph and perform a quantum random walk with one transition rule for unmarked vertices of the graph
and another transition rule for marked vertices of the graph. The result is that the amplitude gathers in the
marked vertices and, after $O(N^{2/3})$ steps, the probability of measuring the marked state is a constant.
number F30602-01-2-0524 (at UC Berkeley), NSF Grant DMS-0111298 (at IAS), NSERC, ARDA, IQC University Professorship
We also give several extensions of our algorithm. If we have to find whether $x_1, \ldots, x_N$ contain $k$
numbers that are equal: $x_{i_1} = \cdots = x_{i_k}$, we get a quantum algorithm with $O(N^{k/(k+1)})$ queries for any
constant¹ $k$.

If the quantum algorithm is restricted to storing $r$ numbers, $r < N^{2/3}$, then we have an algorithm which
solves element distinctness with $O(N/\sqrt{r})$ queries, which is quadratically better than the classical $O(N^2/r)$
query algorithm. Previously, such a quantum algorithm was known only for $r < \sqrt{N}$ [14]. For the problem
of finding $k$ equal numbers, we get an algorithm that uses $O(N^{k/2}/r^{(k-1)/2})$ queries and stores $r$ numbers, for
$r < N^{(k-1)/k}$.

For the analysis of our algorithm, we develop a generalization of Grover's algorithm (Lemma 3) which
might be of independent interest.

1.1 Related work 

Classical element distinctness. Element distinctness has been extensively studied classically. It can be
solved with $O(N)$ queries and $O(N \log N)$ time by querying all the elements and sorting them. Then, any
two equal elements must be adjacent in the sorted order and can be found by going through the
sorted list.
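
The sort-and-scan procedure just described can be sketched as follows. This is only an illustration; the function name `find_collision_by_sorting` and the 0-based indexing are assumptions made for the example and do not come from the paper.

```python
def find_collision_by_sorting(xs):
    """Return indices (i, j), i < j, with xs[i] == xs[j], or None if all are distinct.

    Every element is read once (N queries); the sort costs O(N log N) time.
    """
    # Sort the indices by the value they point to.
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    # Equal values end up next to each other, so one linear scan suffices.
    for a, b in zip(order, order[1:]):
        if xs[a] == xs[b]:
            return (min(a, b), max(a, b))
    return None
```

For instance, the list `[3, 1, 4, 1, 5]` has equal items at positions 1 and 3, while `[3, 1, 4]` has none.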

In the usual query model (where one query gives one value of $x_i$), it is easy to see that $\Omega(N)$ queries are
also necessary. Classical lower bounds have also been shown for more general models (e.g. [25]).

The algorithm described above requires $\Omega(N)$ space to store all of $x_1, \ldots, x_N$. If we are restricted to
space $S < N$, the running time increases. The straightforward algorithm needs $O(N^2/S)$ queries. Yao [38]
has shown that, for the model of comparison-based branching programs, this is essentially optimal. Namely,
any space-$S$ algorithm needs time $T = \Omega(N^{2-o(1)}/S)$. For more general models, lower bounds on algorithms
with restricted space $S$ are an object of ongoing research [10].

Related problems in quantum computing. In the collision problem, we are given a 2-1 function $f$ and
have to find $x, y$ such that $f(x) = f(y)$. As shown by Brassard, Høyer and Tapp, the collision problem
can be solved in $O(N^{1/3})$ quantum steps instead of $\Theta(N^{1/2})$ steps classically. $\Omega(N^{1/3})$ is also a quantum
lower bound [1, 31].

If element distinctness can be solved with $M$ queries, then the collision problem can be solved with $O(\sqrt{M})$
queries. (This connection is credited to Andrew Yao in [1].) Thus, a quantum algorithm for element
distinctness implies a quantum algorithm for collision but not the other way around.

Quantum search on graphs. The idea of quantum search on graphs was proposed by Aaronson and
Ambainis [2] for finding a marked item on a $d$-dimensional grid (a problem first considered by Benioff [12])
and other graphs with good expansion properties. Our work has a similar flavor but uses completely different
methods to search the graph (quantum walk instead of "divide-and-conquer").

Quantum walks. There has been a considerable amount of research on quantum walks (surveyed in [30])
and their applications (surveyed in [6]). Applications of walks [6] mostly fall into two classes. The first
class is exponentially faster hitting times [21, 19, 29]. The second class is quantum walk search algorithms
[36, 22, 8].

Our algorithm is most closely related to the second class. In this direction, Shenvi et al. [36] have
constructed a counterpart of Grover's search [26] based on quantum walk on the hypercube. Childs and
Goldstone [22, 23] and Ambainis et al. [8] have used quantum walk to produce search algorithms on
$d$-dimensional lattices ($d \geq 2$) which are faster than the naive application of Grover's search. This direction is
quite closely related to our work. The algorithms by [36, 22, 8] and the current paper solve different problems
but all have similar structure.

¹The big-O constant depends on $k$. For non-constant $k$, we can show that the number of queries is $O(k^2 N^{k/(k+1)})$. The proof
of that is mostly technical and is omitted in this version.

Recent developments. After the work described in this paper, the results and ideas from this paper
have been used to construct several other quantum algorithms. Magniez et al. [32] have used our element
distinctness algorithm to give an $O(n^{1.3})$ query quantum algorithm for finding triangles in a graph. Ambainis
et al. [8] have used ideas from the current paper to construct a faster algorithm for search on a 2-dimensional
grid. Childs and Eisenberg [20] have given a different analysis of our algorithm.

Szegedy [37] has generalized our results on quantum walk for element distinctness to an arbitrary graph
with a large eigenvalue gap and cast them into the language of Markov chains. His main result is that,
for a class of Markov chains, quantum walk algorithms are quadratically faster than the corresponding
classical algorithm. An advantage of Szegedy's approach is that it can simultaneously handle any number
of solutions (unlike the present paper, which has separate algorithms for the single-solution case (Algorithm
2) and the multiple-solution case (Algorithm 3)).

Buhrman and Spalek [15] have used Szegedy's result to construct an $O(n^{5/3})$ quantum algorithm for
verifying if a product of two $n \times n$ matrices $A$ and $B$ is equal to a third matrix $C$.

2 Preliminaries 

2.1 Quantum query algorithms 

Let $[N]$ denote $\{1, \ldots, N\}$. We consider

Element Distinctness. Given numbers $x_1, \ldots, x_N \in [M]$, are there $i, j \in [N]$, $i \neq j$, such that $x_i = x_j$?

Element distinctness is a particular case of

Element $k$-distinctness. Given numbers $x_1, \ldots, x_N \in [M]$, are there $k$ distinct indices $i_1, \ldots, i_k \in [N]$
such that $x_{i_1} = x_{i_2} = \cdots = x_{i_k}$?

We call such $k$ indices $i_1, \ldots, i_k$ a $k$-collision.

Our model is the quantum query model (for surveys on query model, see |2l fTSlD . In this model, 
our goal is to compute a function $f(x_1, \ldots, x_N)$. For example, $k$-distinctness is viewed as the function
$f(x_1, \ldots, x_N)$ which is 1 if there exists a $k$-collision consisting of $i_1, \ldots, i_k \in [N]$ and 0 otherwise.

The input variables $x_i$ can be accessed by queries to an oracle $X$ and the complexity of $f$ is the number
of queries needed to compute $f$. A quantum computation with $T$ queries is just a sequence of unitary
transformations

\[ U_0 \to O \to U_1 \to O \to \cdots \to U_{T-1} \to O \to U_T. \]

The $U_j$'s can be arbitrary unitary transformations that do not depend on the input bits $x_1, \ldots, x_N$. The $O$ are
query (oracle) transformations. To define $O$, we represent basis states as $|i, a, z\rangle$ where $i$ consists of $\lceil \log N \rceil$
bits, $a$ consists of $\lceil \log M \rceil$ quantum bits and $z$ consists of all other bits. Then, $O$ maps $|i, a, z\rangle$ to
$|i, (a + x_i) \bmod M, z\rangle$.

In our algorithm, we use queries in two situations. The first situation is when $a = |0\rangle$. Then, the state
before the query is some superposition $\sum_{i,z} \alpha_{i,z} |i, 0, z\rangle$ and the state after the query is the same
superposition with the information about $x_i$: $\sum_{i,z} \alpha_{i,z} |i, x_i, z\rangle$. The second situation is when the state before the
query is $\sum_{i,z} \alpha_{i,z} |i, -x_i \bmod M, z\rangle$ with the information about $x_i$ from a previous query. Then,
applying the query transformation makes the state $\sum_{i,z} \alpha_{i,z} |i, 0, z\rangle$, erasing the information about $x_i$. This can
be used to erase the information about $x_i$ from $\sum_{i,z} \alpha_{i,z} |i, x_i, z\rangle$. We first perform a unitary that maps
$|x_i\rangle \to |-x_i \bmod M\rangle$, obtaining the state $\sum_{i,z} \alpha_{i,z} |i, -x_i \bmod M, z\rangle$, and then apply the query
transformation.
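
Since the query transformation only permutes basis states, its effect on a single basis state $|i, a, z\rangle$ can be traced classically. The sketch below does that for the "write, negate, query again" pattern described above. The function names `query` and `negate` and the 0-based index `i` are assumptions made for this illustration.

```python
def query(x, M, state):
    """Action of the oracle O on one basis state (i, a, z): a -> (a + x[i]) mod M."""
    i, a, z = state
    return (i, (a + x[i]) % M, z)


def negate(M, state):
    """Input-independent step used before erasing: a -> (-a) mod M."""
    i, a, z = state
    return (i, (-a) % M, z)


x = [2, 0, 3]   # hypothetical input with values in {0, ..., M-1}
M = 5

written = query(x, M, (2, 0, "z"))           # first situation: a = 0, x_i is written
erased = query(x, M, negate(M, written))     # second situation: x_i is erased again
```

Here `written` holds the value `x[2] = 3` in the middle register, and `erased` is back to the starting state `(2, 0, "z")`. Because both maps are permutations of basis states, they extend linearly to unitary transformations on superpositions.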

The computation starts with a state $|0\rangle$. Then, we apply $U_0, O, \ldots, O, U_T$ and measure the final state.
The result of the computation is the rightmost bit of the state obtained by the measurement.

We say that the quantum computation computes $f$ with bounded error if, for every $x = (x_1, \ldots, x_N)$,
the probability that the rightmost bit of $U_T O_x U_{T-1} \cdots O_x U_0 |0\rangle$ equals $f(x_1, \ldots, x_N)$ is at least $1 - \epsilon$ for
some fixed $\epsilon < 1/2$.

To simplify the exposition, we occasionally describe a quantum computation as a classical algorithm
with several quantum subroutines of the form $U_T O_x U_{T-1} \cdots O_x U_0 |0\rangle$. Any such classical algorithm with
quantum subroutines can be transformed into an equivalent sequence $U_T O_x U_{T-1} \cdots O_x U_0 |0\rangle$ with the
number of queries being equal to the number of queries in the classical algorithm plus the sum of numbers of
queries in all quantum subroutines.

Comparison oracle. In a different version of the query model, we are only allowed comparison queries. In
a comparison query, we give two indices $i, j$ to the oracle. The oracle answers whether $x_i < x_j$ or $x_i \geq x_j$.
In the quantum model, we can query the comparison oracle with a superposition $\sum_{i,j,z} a_{i,j,z} |i, j, z\rangle$, where
$i, j$ are the indices being queried and $z$ is the rest of the quantum state. The oracle then performs a unitary
transformation $|i, j, z\rangle \to -|i, j, z\rangle$ for all $i, j, z$ such that $x_i < x_j$ and $|i, j, z\rangle \to |i, j, z\rangle$ for all $i, j, z$ such
that $x_i \geq x_j$. In Section 6, we show that our algorithms can be adapted to this model with a logarithmic
increase in the number of queries.

2.2 d-wise independence 

To make our algorithms efficient in terms of running time and, in the case of the multiple-solution algorithm in
Section 5, also space, we use $d$-wise independent functions. A reader who is only interested in the query
complexity of the algorithms may skip this subsection.

Definition 1 Let $\mathcal{F}$ be a family of functions $f : [N] \to \{0, 1\}$. $\mathcal{F}$ is $d$-wise independent if for all $d$-tuples
of pairwise distinct $i_1, \ldots, i_d \in [N]$ and all $c_1, \ldots, c_d \in \{0, 1\}$,

\[ \Pr[f(i_1) = c_1, f(i_2) = c_2, \ldots, f(i_d) = c_d] = \frac{1}{2^d}. \]

Theorem 1 There exists a $d$-wise independent family $\mathcal{F} = \{f_j \mid j \in [R]\}$ of functions $f_j : [N] \to \{0, 1\}$
such that:

1. $R = O(N^{\lceil d/2 \rceil})$;

2. $f_j(i)$ is computable in $O(d \log^2 N)$ time, given $j$ and $i$.
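
To make Definition 1 concrete for the smallest interesting case $d = 2$, the sketch below checks by exhaustive enumeration that a standard textbook family is pairwise independent. This is not the construction behind Theorem 1. The family $f_{a,b}(i) = \langle a, i \rangle \oplus b$ over bit strings, the domain $\{0, \ldots, 2^n - 1\}$ (used in place of $[N]$) and all function names are assumptions chosen for the illustration.

```python
from itertools import product


def parity(v):
    """Parity (0 or 1) of the number of set bits of v."""
    return bin(v).count("1") % 2


def f(a, b, i):
    """Member f_{a,b} of the family: inner product of a and i over GF(2), xor b."""
    return parity(a & i) ^ b


def is_pairwise_independent(n):
    """Check Definition 1 with d = 2 for the domain {0, ..., 2**n - 1}."""
    size = 2 ** n
    family = [(a, b) for a in range(size) for b in (0, 1)]
    for i1, i2 in product(range(size), repeat=2):
        if i1 == i2:
            continue
        for c1, c2 in product((0, 1), repeat=2):
            hits = sum(1 for a, b in family
                       if f(a, b, i1) == c1 and f(a, b, i2) == c2)
            # Each of the four outcomes must occur for exactly 1/4 of the family.
            if 4 * hits != len(family):
                return False
    return True
```

Calling `is_pairwise_independent(3)` enumerates all 16 functions on an 8-element domain and confirms that every pair of distinct points sees each of the four value pairs with probability exactly $1/4$.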

We will also use families of permutations with similar properties. It is not known how to construct
small $d$-wise independent families of permutations. There are, however, constructions of approximately
$d$-wise independent families of permutations.
Definition 2 Let $\mathcal{F}$ be a family of permutations $f : [n] \to [n]$. $\mathcal{F}$ is $\epsilon$-approximately $d$-wise independent
if for all $d$-tuples of pairwise distinct $i_1, \ldots, i_d \in [n]$ and pairwise distinct $j_1, \ldots, j_d \in [n]$,

\[ \Pr[f(i_1) = j_1, f(i_2) = j_2, \ldots, f(i_d) = j_d] \in \left[ \frac{1 - \epsilon}{n(n-1) \cdots (n-d+1)}, \frac{1 + \epsilon}{n(n-1) \cdots (n-d+1)} \right]. \]



Theorem 2 [28] Let $n$ be an even power of a prime number. For any $d < n$, $\epsilon > 0$, there exists an
$\epsilon$-approximate $d$-wise independent family $\mathcal{F} = \{\pi_j \mid j \in [R]\}$ of permutations $\pi_j : [n] \to [n]$ such that:



2. $\pi_j(i)$ is computable in $O(d \log^2 n)$ time, given $j$ and $i$.



3 Results and algorithms 

Our main results are 

Theorem 3 Element $k$-distinctness can be solved by a quantum algorithm with $O(N^{k/(k+1)})$ queries. In
particular, element distinctness can be solved by a quantum algorithm with $O(N^{2/3})$ queries.

Theorem 4 Let $r > k$, $r = o(N)$. There is a quantum algorithm that solves element distinctness with
$O(\max(\frac{N}{\sqrt{r}}, r))$ queries and $k$-distinctness with $O(\max(\frac{N^{k/2}}{r^{(k-1)/2}}, r))$ queries, using $O(r(\log M + \log N))$
qubits of memory.

Theorem 3 follows from Theorem 4 by setting $r = \lfloor N^{2/3} \rfloor$ for element distinctness and $r = \lfloor N^{k/(k+1)} \rfloor$
for $k$-distinctness. (These values minimize the expressions for the number of queries in Theorem 4.)
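
The choice of $r$ can be checked numerically: the first term of the maximum decreases in $r$ and the second increases, so the minimum sits where they cross, at $r = N^{k/(k+1)}$. The sketch below does a brute-force search over integer $r$. The function names are assumptions made for the illustration, and constants hidden by the big-O are ignored.

```python
def query_bound(N, r, k=2):
    """The expression max(N^(k/2) / r^((k-1)/2), r) from the query bound."""
    return max(N ** (k / 2) / r ** ((k - 1) / 2), r)


def best_r(N, k=2):
    """Integer r in [k, N) minimizing query_bound, found by exhaustive search."""
    return min(range(k, N), key=lambda r: query_bound(N, r, k))
```

For $N = 1000$ and $k = 2$ the search returns $r = 100 = N^{2/3}$, and for $N = 10000$ and $k = 3$ it returns $r = 1000 = N^{3/4}$.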

Next, we present Algorithm 2, which solves element distinctness if we have a promise that $x_1, \ldots, x_N$
are either all distinct or there is exactly one pair $i, j$, $i \neq j$, $x_i = x_j$ (and $k$-distinctness if we have a
promise that there is at most one set of $k$ indices $i_1, \ldots, i_k$ such that $x_{i_1} = x_{i_2} = \cdots = x_{i_k}$). The proof
of correctness of Algorithm 2 is given in Section 4. After that, in Section 5, we present Algorithm 3, which
solves the general case, using Algorithm 2 as a subroutine.

3.1 Main ideas 

We start with an informal description of the main ideas. For simplicity, we restrict to element distinctness and
postpone the more general $k$-distinctness till the end of this subsection.

Let $r = N^{2/3}$. We define a graph $G$ with $\binom{N}{r} + \binom{N}{r+1}$ vertices. The vertices $v_S$ correspond to sets
$S \subseteq [N]$ of size $r$ and $r + 1$. Two vertices $v_S$ and $v_T$ are connected by an edge if $T = S \cup \{i\}$ for some
$i \in [N]$. A vertex $v_S$ is marked if $S$ contains $i, j$, $i \neq j$, such that $x_i = x_j$.

Element distinctness reduces to finding a marked vertex in this graph. If we find a marked vertex $v_S$,
then we know that $x_i = x_j$ for some $i, j \in S$, i.e. $x_1, \ldots, x_N$ are not all distinct.

The naive way to find a marked vertex would be to use Grover's quantum search algorithm [26]. If an
$\epsilon$ fraction of vertices are marked, then Grover's search finds a marked vertex after examining $O(\frac{1}{\sqrt{\epsilon}})$ vertices. Assume
that there exists a single pair $i, j \in [N]$ such that $i \neq j$, $x_i = x_j$. For a random $S$, $|S| = N^{2/3}$, the
probability of $v_S$ being marked is

\[ \Pr[i \in S; j \in S] = \Pr[i \in S] \Pr[j \in S \mid i \in S] = \frac{N^{2/3}}{N} \cdot \frac{N^{2/3} - 1}{N - 1} = (1 - o(1)) \frac{1}{N^{2/3}}. \]

Thus, a quantum algorithm can find a marked vertex by examining $O(\frac{1}{\sqrt{\epsilon}}) = O(N^{1/3})$ vertices. However,
to find out if a vertex is marked, the algorithm needs to query $N^{2/3}$ items $x_i$, $i \in S$. This makes the total
query complexity $O(N^{1/3} N^{2/3}) = O(N)$, giving no speedup compared to the classical algorithm which
queries all items.
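
The arithmetic of this naive approach is easy to reproduce. The sketch below computes the exact fraction of marked sets for a single colliding pair, cross-checks it by enumeration on a tiny instance, and evaluates the resulting cost of roughly $r / \sqrt{\epsilon}$ queries. The function names are assumptions made for the illustration, and constant factors in Grover's search are ignored.

```python
from fractions import Fraction
from itertools import combinations
from math import sqrt


def marked_fraction(N, r):
    """Pr[i in S and j in S] for a uniformly random r-subset S and fixed i != j."""
    return Fraction(r, N) * Fraction(r - 1, N - 1)


def marked_fraction_brute_force(N, r, i=0, j=1):
    """Same probability, obtained by listing all r-subsets of {0, ..., N-1}."""
    subsets = list(combinations(range(N), r))
    marked = sum(1 for S in subsets if i in S and j in S)
    return Fraction(marked, len(subsets))


def naive_grover_queries(N, r):
    """About 1/sqrt(eps) vertices examined, each costing r queries."""
    eps = float(marked_fraction(N, r))
    return r / sqrt(eps)
```

With $N = 10^6$ and $r = N^{2/3} = 10^4$, `naive_grover_queries` evaluates to about $1.00005 \cdot 10^6$, which is essentially $N$ and matches the conclusion that no speedup is obtained.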

We improve on this naive algorithm by re-using the information from previous queries. Assume that we
just checked if $v_S$ is marked by querying all $x_i$, $i \in S$. If the next vertex $v_T$ is such that $T$ contains only $m$
elements $i \notin S$, then we only need to query $m$ elements $x_i$, $i \in T \setminus S$ instead of $r = N^{2/3}$ elements $x_i$,
$i \in T$.

To formalize this, we use the following model. At each moment, we are at one vertex of $G$ (a superposition
of vertices in the quantum case). In one time step, we can examine if the current vertex $v_S$ is marked and move
to an adjacent vertex $v_T$. Assume that there is an algorithm $\mathcal{A}$ that finds a marked vertex with $M$ moves
between vertices. Then, there is an algorithm that solves element distinctness in $M + r$ steps, in the following
way:

1. We use $r$ queries to query all $x_i$, $i \in S$ for the starting vertex $v_S$.

2. We then repeat the following two operations $M$ times:

(a) Check if the current vertex $v_S$ is marked. This can be done without any queries because we
already know all $x_i$, $i \in S$.

(b) We simulate the algorithm $\mathcal{A}$ until the next move and find the vertex $v_T$ to which it moves from $v_S$.
We then move to $v_T$, by querying $x_i$, $i \in T \setminus S$. After that, we know all $x_i$, $i \in T$. We then set
$S = T$.

The total number of queries is at most M + r, consisting of r queries for the first step and 1 query to 
simulate each move of A. 
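
The query accounting in this simulation can be illustrated with a small classical sketch. It walks along a given sequence of vertices, queries only the elements of $T \setminus S$ at each move, and counts queries. The function name `walk_with_reuse`, the 0-based indices and the example path are assumptions made for the illustration; choosing the moves (the role of the algorithm $\mathcal{A}$) is left to the caller.

```python
def walk_with_reuse(xs, start, moves):
    """Follow the vertices in `moves`, reusing known values; return (marked set or None, queries)."""
    S = set(start)
    known = {i: xs[i] for i in S}          # step 1: r queries for the starting vertex
    queries = len(S)
    for T in moves:
        # step 2(a): checking the current vertex needs no queries
        if len(set(known.values())) < len(known):
            return S, queries
        T = set(T)
        # step 2(b): query only the elements that are new in T
        for i in T - S:
            known[i] = xs[i]
            queries += 1
        for i in S - T:
            del known[i]
        S = T
    marked = len(set(known.values())) < len(known)
    return (S if marked else None), queries


xs = [5, 7, 9, 7]                          # hypothetical input, x_1 = x_3
found, used = walk_with_reuse(xs, {0, 1}, [{0, 1, 2}, {1, 2}, {1, 2, 3}])
```

On this example the walk ends at the marked set `{1, 2, 3}` after 4 queries, within the bound $M + r = 3 + 2$.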

In the next sections, we will show how to search this graph by quantum walk in $O(N^{2/3})$ steps for
element distinctness and $O(N^{k/(k+1)})$ steps for $k$-distinctness.



3.2 The algorithm 

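Let $x_1, \ldots, x_N \in [M]$. We consider two Hilbert spaces $\mathcal{H}$ and $\mathcal{H}'$. $\mathcal{H}$ has dimension $\binom{N}{r} M^r (N - r)$ and
the basis states of $\mathcal{H}$ are $|S, x, y\rangle$ with $S \subseteq [N]$, $|S| = r$, $x \in [M]^r$, $y \in [N] \setminus S$. $\mathcal{H}'$ has dimension
$\binom{N}{r+1} M^{r+1} (r + 1)$. The basis states of $\mathcal{H}'$ are $|S, x, y\rangle$ with $S \subseteq [N]$, $|S| = r + 1$, $x \in [M]^{r+1}$, $y \in S$. Our
algorithm thus uses

\[ O\left( \log\left( \binom{N}{r} M^r (N - r) + \binom{N}{r+1} M^{r+1} (r + 1) \right) \right) = O(r(\log N + \log M)) \]

qubits of memory.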
In the states used by our algorithm, $x$ will always be equal to $(x_{i_1}, \ldots, x_{i_r})$ where $i_1, \ldots, i_r$ are the elements
of $S$ in increasing order.

We start by defining a quantum walk on $\mathcal{H}$ and $\mathcal{H}'$ (Algorithm 1). Each step of the quantum walk starts
in a superposition of states in $\mathcal{H}$. The first three steps map the state from $\mathcal{H}$ to $\mathcal{H}'$ and the last three steps
map it back to $\mathcal{H}$.

1. Apply the transformation mapping $|S\rangle|y\rangle$ to

\[ |S\rangle \left( \left(-1 + \frac{2}{N - r}\right) |y\rangle + \frac{2}{N - r} \sum_{y' \notin S,\, y' \neq y} |y'\rangle \right) \]

on the $S$ and $y$ registers of the state in $\mathcal{H}$. (This transformation is a variant of the "diffusion
transformation" in [26].)

2. Map the state from $\mathcal{H}$ to $\mathcal{H}'$ by adding $y$ to $S$ and changing $x$ to a vector of length $r + 1$ by introducing
0 in the location corresponding to $y$.

3. Query for $x_y$ and insert it into the location of $x$ corresponding to $y$.

4. Apply the transformation mapping $|S\rangle|y\rangle$ to

\[ |S\rangle \left( \left(-1 + \frac{2}{r + 1}\right) |y\rangle + \frac{2}{r + 1} \sum_{y' \in S,\, y' \neq y} |y'\rangle \right) \]

on the $y$ register.

5. Erase the element of $x$ corresponding to the new $y$ by using it as the input to a query for $x_y$.

6. Map the state back to $\mathcal{H}$ by removing the 0 component corresponding to $y$ from $x$ and removing $y$
from $S$.

Algorithm 1: One step of quantum walk

If there is at most one $k$-collision, we apply Algorithm 2 ($t_1$ and $t_2$ are $c_1 (\frac{N}{r})^{k/2}$ and $c_2 \sqrt{r}$ for constants
$c_1$ and $c_2$ which can be calculated from the analysis in Section 4). This algorithm alternates quantum walk
with a transformation that changes the phase if the current state contains a $k$-collision. We give a proof of
correctness for Algorithm 2 in Section 4.

If there can be more than one $k$-collision, element $k$-distinctness is solved by Algorithm 3. Algorithm 3 is a
classical algorithm that randomly selects several subsets of $x_i$ and runs Algorithm 2 on each subset. We give
Algorithm 3 and its analysis in Section 5.
1. Generate the uniform superposition $\frac{1}{\sqrt{\binom{N}{r}(N - r)}} \sum_{|S| = r,\, y \notin S} |S\rangle|y\rangle$.

2. Query all $x_i$ for $i \in S$. This transforms the state to

\[ \frac{1}{\sqrt{\binom{N}{r}(N - r)}} \sum_{|S| = r,\, y \notin S} |S\rangle|y\rangle \bigotimes_{i \in S} |x_i\rangle. \]

3. $t_1 = O((N/r)^{k/2})$ times repeat:

(a) Apply the conditional phase flip (the transformation $|S\rangle|y\rangle|x\rangle \to -|S\rangle|y\rangle|x\rangle$) for $S$ such that
$x_{i_1} = x_{i_2} = \cdots = x_{i_k}$ for $k$ distinct $i_1, \ldots, i_k \in S$.

(b) Perform $t_2 = O(\sqrt{r})$ steps of the quantum walk (Algorithm 1).

4. Measure the final state. Check if $S$ contains a $k$-collision and answer "there is a $k$-collision" or "there
is no $k$-collision", according to the result.

Algorithm 2: Single-solution algorithm



4 Analysis of single $k$-collision algorithm
4.1 Overview

The number of queries for Algorithm 2 is $r$ for creating the initial state and $O((N/r)^{k/2} \sqrt{r}) = O(\frac{N^{k/2}}{r^{(k-1)/2}})$
for the rest of the algorithm. Thus, the overall number of queries is $O(\max(r, \frac{N^{k/2}}{r^{(k-1)/2}}))$. The correctness of
Algorithm 2 follows from

Theorem 5 Let the input $x_1, \ldots, x_N$ be such that $x_{i_1} = \cdots = x_{i_k}$ for exactly one set of $k$ distinct values
$i_1, \ldots, i_k$. With a constant probability, measuring the final state of Algorithm 2 gives $S$ such that $i_1, \ldots, i_k \in S$.

Proof: The main ideas are as follows. We first show (Lemma 1) that the algorithm's state always stays in a
$(2k + 1)$-dimensional subspace of $\mathcal{H}$. After that (Lemma 2), we find the eigenvalues for the unitary
transformation induced by one step of the quantum walk (Algorithm 1), restricted to this subspace. We then look
at Algorithm 2 as a sequence of the form $(U_2 U_1)^{t_1}$ with $U_1$ being a conditional phase flip and $U_2$ being a
unitary transformation whose eigenvalues have certain properties (in this case, $U_2$ is $t_2$ steps of quantum
walk). We then prove a general result (Lemma 3) about such sequences, which implies that the algorithm
finds the $k$-collision with a constant probability.

Let $|S, y\rangle$ be a shortcut for the basis state $|S\rangle \otimes_{i \in S} |x_i\rangle |y\rangle$. In our algorithm, the $|x\rangle$ register of a
state $|S, x, y\rangle$ always contains the state $\otimes_{i \in S} |x_i\rangle$. Therefore, the state of the algorithm is always a linear
combination of the basis states $|S, y\rangle$.

We classify the basis states $|S, y\rangle$ ($|S| = r$, $y \notin S$) into $2k + 1$ types. A state $|S, y\rangle$ is of type $(j, 0)$ if
$|S \cap \{i_1, \ldots, i_k\}| = j$ and $y \notin \{i_1, \ldots, i_k\}$ and of type $(j, 1)$ if $|S \cap \{i_1, \ldots, i_k\}| = j$ and $y \in \{i_1, \ldots, i_k\}$.
For $j \in \{0, \ldots, k - 1\}$, there are both type $(j, 0)$ and type $(j, 1)$ states. For $j = k$, there are only $(k, 0)$ type
states. (The $(k, 1)$ type is impossible because, if $|S \cap \{i_1, \ldots, i_k\}| = k$, then $y \notin S$ implies $y \notin \{i_1, \ldots, i_k\}$.)
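
The count of $2k + 1$ types can be confirmed by direct enumeration on a small instance. The sketch below lists all pairs $(S, y)$ with $|S| = r$, $y \notin S$ and collects the types that occur. The function name `state_types`, the 0-based indices and the particular choice of collision indices are assumptions made for the illustration.

```python
from itertools import combinations


def state_types(N, r, collision):
    """Set of types (j, l) occurring among basis states |S, y> with |S| = r, y not in S."""
    collision = set(collision)
    types = set()
    for subset in combinations(range(N), r):
        S = set(subset)
        j = len(S & collision)
        for y in range(N):
            if y not in S:
                types.add((j, 1 if y in collision else 0))
    return types
```

For $N = 7$, $r = 3$ and the collision $\{0, 1\}$ (so $k = 2$), the function returns the five types $(0,0)$, $(0,1)$, $(1,0)$, $(1,1)$, $(2,0)$, and $(2,1)$ never appears.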

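Let $|\psi_{j,l}\rangle$ be the uniform superposition of basis states $|S, y\rangle$ of type $(j, l)$. Let $\tilde{\mathcal{H}}$ be the $(2k + 1)$-dimensional
space spanned by the states $|\psi_{j,l}\rangle$.

For the space $\mathcal{H}'$, its basis states $|S, y\rangle$ ($|S| = r + 1$, $y \in S$) can be similarly classified into $2k + 1$ types.
We denote those types $(j, l)$ with $j = |S \cap \{i_1, \ldots, i_k\}|$, $l = 1$ if $y \in \{i_1, \ldots, i_k\}$ and $l = 0$ otherwise.
(Notice that, since $y \in S$ for the space $\mathcal{H}'$, we have type $(k, 1)$ but no type $(0, 1)$.) Let $|\varphi_{j,l}\rangle$ be the uniform
superposition of basis states $|S, y\rangle$ of type $(j, l)$ for the space $\mathcal{H}'$. Let $\tilde{\mathcal{H}}'$ be the $(2k + 1)$-dimensional space
spanned by $|\varphi_{j,l}\rangle$. Notice that the transformation $|S, y\rangle \to |S \cup \{y\}, y\rangle$ maps
$|\psi_{j,0}\rangle \to |\varphi_{j,0}\rangle$ and $|\psi_{j,1}\rangle \to |\varphi_{j+1,1}\rangle$.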

We claim 

Lemma 1 In Algorithm 1, steps 1-3 map $\tilde{\mathcal{H}}$ to $\tilde{\mathcal{H}}'$ and steps 4-6 map $\tilde{\mathcal{H}}'$ to $\tilde{\mathcal{H}}$.

Proof: In Section 4.2. ∎

Thus, Algorithm 1 maps $\tilde{\mathcal{H}}$ to itself. Also, in Algorithm 2, step 3a maps $|\psi_{k,0}\rangle \to -|\psi_{k,0}\rangle$ and leaves
$|\psi_{j,l}\rangle$ for $j < k$ unchanged (because $|\psi_{j,l}\rangle$, $j < k$ are superpositions of states $|S, y\rangle$ which are unchanged
by step 3a and $|\psi_{k,0}\rangle$ is a superposition of states $|S, y\rangle$ which are mapped to $-|S, y\rangle$ by step 3a). Thus,
every step of Algorithm 2 maps $\tilde{\mathcal{H}}$ to itself. Also, the starting state of Algorithm 2 can be expressed as a
combination of $|\psi_{j,l}\rangle$. Therefore, it suffices to analyze Algorithms 1 and 2 on the subspace $\tilde{\mathcal{H}}$.

In this subspace, we will be interested in two particular states. Let $|\psi_{start}\rangle$ be the uniform superposition
of all $|S, y\rangle$, $|S| = r$, $y \notin S$. Let $|\psi_{good}\rangle = |\psi_{k,0}\rangle$ be the uniform superposition of all $|S, y\rangle$ with $i_1, \ldots, i_k \in S$.
$|\psi_{start}\rangle$ is the algorithm's starting state, $|\psi_{good}\rangle$ is the state we would like to obtain (because measuring
$|\psi_{good}\rangle$ gives a random set $S$ such that $\{i_1, \ldots, i_k\} \subseteq S$).

We start by analyzing a single step of quantum walk. 

Lemma 2 Let $U$ be the unitary transformation induced on $\tilde{\mathcal{H}}$ by one step of the quantum walk (Algorithm
1). $U$ has $2k + 1$ different eigenvalues in $\tilde{\mathcal{H}}$. One of them is 1, with $|\psi_{start}\rangle$ being the eigenvector. The other
eigenvalues are $e^{\pm \theta_1 i}, \ldots, e^{\pm \theta_k i}$ with $\theta_j = (2\sqrt{j} + o(1)) \frac{1}{\sqrt{r}}$.

Proof: In Section 4.2. ∎

We set $t_2 = \lceil \frac{\pi}{3\sqrt{k}} \sqrt{r} \rceil$. Since one step of quantum walk fixes $\tilde{\mathcal{H}}$, $t_2$ steps fix $\tilde{\mathcal{H}}$ as well. Moreover,
$|\psi_{start}\rangle$ will still be an eigenvector with eigenvalue 1. The other $2k$ eigenvalues become $e^{\pm i (\frac{2\pi\sqrt{j}}{3\sqrt{k}} + o(1))}$.
Thus, every one of those eigenvalues is $e^{i\theta}$ with $\theta \in [c, 2\pi - c]$, for a constant $c$ independent of $N$ and $r$.

Let $U_1$ be step 3a of Algorithm 2 and $U_2 = U^{t_2}$ be step 3b. Then, the entire algorithm consists of
applying $(U_2 U_1)^{t_1}$ to $|\psi_{start}\rangle$. We will apply

Lemma 3 Let $\mathcal{H}$ be a finite dimensional Hilbert space and $|\psi_1\rangle, \ldots, |\psi_m\rangle$ be an orthonormal basis for $\mathcal{H}$.
Let $|\psi_{good}\rangle$, $|\psi_{start}\rangle$ be two states in $\mathcal{H}$ which are superpositions of $|\psi_1\rangle, \ldots, |\psi_m\rangle$ with real amplitudes and
$\langle \psi_{good} | \psi_{start} \rangle = \alpha$. Let $U_1$, $U_2$ be unitary transformations on $\mathcal{H}$ with the following properties:

1. $U_1$ is the transformation that flips the phase on $|\psi_{good}\rangle$ ($U_1 |\psi_{good}\rangle = -|\psi_{good}\rangle$) and leaves any state
orthogonal to $|\psi_{good}\rangle$ unchanged.

2. $U_2$ is a transformation which is described by a real-valued $m \times m$ matrix in the basis $|\psi_1\rangle, \ldots, |\psi_m\rangle$.
Moreover, $U_2 |\psi_{start}\rangle = |\psi_{start}\rangle$ and, if $|\psi\rangle$ is an eigenvector of $U_2$ perpendicular to $|\psi_{start}\rangle$, then
$U_2 |\psi\rangle = e^{i\theta} |\psi\rangle$ for $\theta \in [\epsilon, 2\pi - \epsilon]$, $\theta \neq \pi$ (where $\epsilon$ is a constant, $\epsilon > 0$)².

Then, there exists $t = O(\frac{1}{\alpha})$ such that $|\langle \psi_{good} | (U_2 U_1)^t | \psi_{start} \rangle| = \Omega(1)$. (The constant under $\Omega(1)$ is
independent of $\alpha$ but can depend on $\epsilon$.)

Proof: In Section 4.3. ∎

By Lemma 3, we can set $t_1 = O(\frac{1}{\alpha})$ so that the inner product of $(U_2 U_1)^{t_1} |\psi_{start}\rangle$ and $|\psi_{good}\rangle$ is
a constant. Since $|\psi_{good}\rangle$ is a superposition of $|S, y\rangle$ over $S$ satisfying $\{i_1, \ldots, i_k\} \subseteq S$, measuring
$(U_2 U_1)^{t_1} |\psi_{start}\rangle$ gives a set $S$ satisfying $\{i_1, \ldots, i_k\} \subseteq S$ with a constant probability.

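It remains to calculate $\alpha$. Let $\alpha'$ be the fraction of $S$ satisfying $\{i_1, \ldots, i_k\} \subseteq S$. Since $|\psi_{start}\rangle$ is the
uniform superposition of all $|S, y\rangle$ and $|\psi_{good}\rangle$ is the uniform superposition of $|S, y\rangle$ with $\{i_1, \ldots, i_k\} \subseteq S$,
we have $\alpha = \sqrt{\alpha'}$.

\[ \alpha' = \Pr[\{i_1, \ldots, i_k\} \subseteq S] = \frac{\binom{N-k}{r-k}}{\binom{N}{r}} = \frac{r}{N} \prod_{j=1}^{k-1} \frac{r - j}{N - j} = (1 - o(1)) \frac{r^k}{N^k}. \]

Therefore, $\alpha = \Omega(\frac{r^{k/2}}{N^{k/2}})$ and $t_1 = O((N/r)^{k/2})$. ∎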
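
The two closed forms for the fraction of sets $S$ that contain all collision indices can be compared exactly with rational arithmetic. In the sketch below, the function names are assumptions made for the illustration.

```python
from fractions import Fraction
from math import comb


def alpha_prime_binomial(N, r, k):
    """Fraction of r-subsets of [N] containing k fixed elements, as a ratio of binomials."""
    return Fraction(comb(N - k, r - k), comb(N, r))


def alpha_prime_product(N, r, k):
    """The same fraction written as the product of (r - j)/(N - j) for j = 0, ..., k-1."""
    result = Fraction(1)
    for j in range(k):
        result *= Fraction(r - j, N - j)
    return result
```

Both forms agree, for example giving $1/57$ for $N = 20$, $r = 6$, $k = 3$. For $N = 10^6$, $r = 10^4$, $k = 2$ the exact value is within a factor $0.9999$ of $(r/N)^k$, which illustrates the $(1 - o(1))$ term.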

Lemma 3 might also be interesting by itself. It generalizes one of the analyses of Grover's algorithm [3].
Informally, the lemma says that, in a Grover-like sequence of transformations $(U_2 U_1)^t$, we can significantly
relax the constraints on $U_2$ and the algorithm will still give a similar result. It is quite likely that such situations
might appear in the analysis of other algorithms.

For the quantum walk for element $k$-distinctness, Childs and Eisenberg [20] have improved the analysis
of Lemma 3 by showing that $\langle \psi_{good} | (U_2 U_1)^{t_1} | \psi_{start} \rangle$ (and, hence, the algorithm's success probability) is $1 - o(1)$.
Their result, however, does not apply to arbitrary transformations $U_1$ and $U_2$ satisfying the conditions of Lemma 3.



4.2 Proofs of Lemmas 1 and 2

Proof: [of Lemma 1] To show that $\tilde{\mathcal{H}}$ is mapped to $\tilde{\mathcal{H}}'$, it suffices to show that each of the basis vectors
$|\psi_{j,l}\rangle$ is mapped to a vector in $\tilde{\mathcal{H}}'$. Consider vectors $|\psi_{j,0}\rangle$ and $|\psi_{j,1}\rangle$ for $j \in \{0, 1, \ldots, k - 1\}$. Fix $S$,
$|S \cap \{i_1, \ldots, i_k\}| = j$. We divide $[N] \setminus S$ into two sets $S_0$ and $S_1$. Let

\[ S_0 = \{y : y \in [N] \setminus S, y \notin \{i_1, \ldots, i_k\}\}, \]

\[ S_1 = \{y : y \in [N] \setminus S, y \in \{i_1, \ldots, i_k\}\}. \]

Since $|S \cap \{i_1, \ldots, i_k\}| = j$, $S_1$ contains $s_1 = k - j$ elements. Since $S_0 \cup S_1 = [N] \setminus S$ contains
$N - r$ elements, $S_0$ contains $s_0 = N - r - k + j$ elements. Define $|\psi_{S,0}\rangle = \frac{1}{\sqrt{s_0}} \sum_{y \in S_0} |S, y\rangle$ and
$|\psi_{S,1}\rangle = \frac{1}{\sqrt{s_1}} \sum_{y \in S_1} |S, y\rangle$. Then, we have

\[ |\psi_{j,0}\rangle = \frac{1}{\sqrt{\binom{k}{j} \binom{N-k}{r-j}}} \sum_{\substack{S : |S| = r \\ |S \cap \{i_1, \ldots, i_k\}| = j}} |\psi_{S,0}\rangle \qquad (1) \]

and, similarly for $|\psi_{j,1}\rangle$ and $|\psi_{S,1}\rangle$.

²The requirement $\theta \neq \pi$ is made to simplify the proof of the lemma. The lemma remains true if $\theta = \pi$ is allowed. At the end
of Section 4.3, we sketch how to modify the proof for this case.

Consider step 1 of Algorithm 1, applied to the state $|\psi_{S,0}\rangle$. Let $|\psi'_{S,0}\rangle$ be the resulting state. Since the
$|S\rangle$ register is unchanged, $|\psi'_{S,0}\rangle$ is some superposition of states $|S, y\rangle$. Moreover, both the state $|\psi_{S,0}\rangle$ and
the transformation applied to this state in step 1 are invariant under permutation of states $|S, y\rangle$, $y \in S_0$ or
states $|S, y\rangle$, $y \in S_1$. Therefore, the resulting state must be invariant under such permutations as well. This
means that all $|S, y\rangle$, $y \in S_0$ have the same amplitude in $|\psi'_{S,0}\rangle$, and so do all $|S, y\rangle$, $y \in S_1$. This is equivalent
to $|\psi'_{S,0}\rangle = a |\psi_{S,0}\rangle + b |\psi_{S,1}\rangle$ for some $a$, $b$. Because of equation (1), this means that step 1 maps $|\psi_{j,0}\rangle$
to $a |\psi_{j,0}\rangle + b |\psi_{j,1}\rangle$. Steps 2 and 3 then map $|\psi_{j,0}\rangle$ to $|\varphi_{j,0}\rangle$ and $|\psi_{j,1}\rangle$ to $|\varphi_{j+1,1}\rangle$. Thus, $|\psi_{j,0}\rangle$ is mapped
to a superposition of two basis states of $\tilde{\mathcal{H}}'$: $|\varphi_{j,0}\rangle$ and $|\varphi_{j+1,1}\rangle$. Similarly, $|\psi_{j,1}\rangle$ is mapped to a (different)
superposition of those two states.

For $j = k$, we only have one state $|\psi_{k,0}\rangle$. A similar argument shows that this state is unchanged by step
1 and then mapped to $|\varphi_{k,0}\rangle$ which belongs to $\tilde{\mathcal{H}}'$.

Thus, steps 1-3 map $\tilde{\mathcal{H}}$ to $\tilde{\mathcal{H}}'$. The proof that steps 4-6 map $\tilde{\mathcal{H}}'$ to $\tilde{\mathcal{H}}$ is similar. ∎
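Proof: [of Lemma 2] We fix a basis for $\tilde{\mathcal{H}}$ consisting of $|\psi_{j,0}\rangle$, $|\psi_{j,1}\rangle$, $j \in \{0, \ldots, k - 1\}$ and $|\psi_{k,0}\rangle$ and a
basis for $\tilde{\mathcal{H}}'$ consisting of $|\varphi_{0,0}\rangle$ and $|\varphi_{j,1}\rangle$, $|\varphi_{j,0}\rangle$, $j \in \{1, \ldots, k\}$. Let $D_\epsilon$ be the matrix

\[ D_\epsilon = \begin{pmatrix} 1 - 2\epsilon & 2\sqrt{\epsilon - \epsilon^2} \\ 2\sqrt{\epsilon - \epsilon^2} & -1 + 2\epsilon \end{pmatrix}. \]
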
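Claim 1 Let $U_1$ be the unitary transformation mapping $\tilde{\mathcal{H}}$ to $\tilde{\mathcal{H}}'$ induced by steps 1-3 of the quantum walk.
Then, $U_1$ is described by a block diagonal matrix

\[ U_1 = \begin{pmatrix} D_{\frac{k}{N-r}} & 0 & \cdots & 0 & 0 \\ 0 & D_{\frac{k-1}{N-r}} & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & D_{\frac{1}{N-r}} & 0 \\ 0 & 0 & \cdots & 0 & 1 \end{pmatrix}, \]

where the columns are in the basis $|\psi_{0,0}\rangle, |\psi_{0,1}\rangle, |\psi_{1,0}\rangle, |\psi_{1,1}\rangle, \ldots, |\psi_{k,0}\rangle$ and the rows are in the basis
$|\varphi_{0,0}\rangle, |\varphi_{1,1}\rangle, |\varphi_{1,0}\rangle, |\varphi_{2,1}\rangle, \ldots, |\varphi_{k,1}\rangle, |\varphi_{k,0}\rangle$.
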

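Proof: Let $\tilde{\mathcal{H}}_j$ be the 2-dimensional subspace of $\tilde{\mathcal{H}}$ spanned by $|\psi_{j,0}\rangle$ and $|\psi_{j,1}\rangle$. Let $\tilde{\mathcal{H}}'_j$ be the
2-dimensional subspace of $\tilde{\mathcal{H}}'$ spanned by $|\varphi_{j,0}\rangle$ and $|\varphi_{j+1,1}\rangle$.

From the proof of Lemma 1, we know that the subspace $\tilde{\mathcal{H}}_j$ is mapped to the subspace $\tilde{\mathcal{H}}'_j$. Thus, we
have a block diagonal matrix with $2 \times 2$ blocks mapping $\tilde{\mathcal{H}}_j$ to $\tilde{\mathcal{H}}'_j$ and a $1 \times 1$ identity matrix mapping
$|\psi_{k,0}\rangle$ to $|\varphi_{k,0}\rangle$. It remains to show that the transformation from $\tilde{\mathcal{H}}_j$ to $\tilde{\mathcal{H}}'_j$ is $D_{\frac{k-j}{N-r}}$. Let $S$ be such that
$|S \cap \{i_1, \ldots, i_k\}| = j$. Let $S_0$, $S_1$, $|\psi_{S,0}\rangle$, $|\psi_{S,1}\rangle$ be as in the proof of Lemma 1. Then, step 1 of Algorithm
1 maps $|\psi_{S,0}\rangle$ to

\[ \frac{1}{\sqrt{s_0}} \sum_{y \in S_0} \left( \left(-1 + \frac{2}{N - r}\right) |S, y\rangle + \frac{2}{N - r} \sum_{y' \neq y,\, y' \notin S} |S, y'\rangle \right) \]

\[ = \frac{1}{\sqrt{s_0}} \left( -1 + \frac{2}{N - r} + (s_0 - 1) \frac{2}{N - r} \right) \sum_{y \in S_0} |S, y\rangle + \frac{1}{\sqrt{s_0}} \, s_0 \, \frac{2}{N - r} \sum_{y \in S_1} |S, y\rangle \]

\[ = \left( -1 + \frac{2 s_0}{N - r} \right) |\psi_{S,0}\rangle + \frac{2 \sqrt{s_0 s_1}}{N - r} |\psi_{S,1}\rangle. \]

By a similar calculation, $|\psi_{S,1}\rangle$ is mapped to

\[ \left( -1 + \frac{2 s_1}{N - r} \right) |\psi_{S,1}\rangle + \frac{2 \sqrt{s_0 s_1}}{N - r} |\psi_{S,0}\rangle = \left( 1 - \frac{2 s_0}{N - r} \right) |\psi_{S,1}\rangle + \frac{2 \sqrt{s_0 s_1}}{N - r} |\psi_{S,0}\rangle. \]

Thus, step 1 produces the transformation $D_{\frac{k-j}{N-r}}$ on $|\psi_{S,0}\rangle$ and $|\psi_{S,1}\rangle$. Since $|\psi_{j,0}\rangle$ and $|\psi_{j,1}\rangle$ are uniform
superpositions of $|\psi_{S,0}\rangle$ and $|\psi_{S,1}\rangle$ over all $S$, step 1 also produces the same transformation $D_{\frac{k-j}{N-r}}$ on
$|\psi_{j,0}\rangle$ and $|\psi_{j,1}\rangle$. Steps 2 and 3 just map $|\psi_{j,0}\rangle$ to $|\varphi_{j,0}\rangle$ and $|\psi_{j,1}\rangle$ to $|\varphi_{j+1,1}\rangle$. ∎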

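Similarly, steps 4-6 give the transformation $U_2$ described by the block-diagonal matrix

\[ U_2 = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & D'_{\frac{1}{r+1}} & 0 & \cdots & 0 \\ 0 & 0 & D'_{\frac{2}{r+1}} & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & D'_{\frac{k}{r+1}} \end{pmatrix} \]

from $\tilde{\mathcal{H}}'$ to $\tilde{\mathcal{H}}$. Here, $D'_\epsilon$ denotes the matrix

\[ D'_\epsilon = \begin{pmatrix} -1 + 2\epsilon & 2\sqrt{\epsilon - \epsilon^2} \\ 2\sqrt{\epsilon - \epsilon^2} & 1 - 2\epsilon \end{pmatrix}. \]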



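A step of quantum walk is $U = U_2 U_1$. Let $V$ be the diagonal matrix with odd entries on the diagonal
being $-1$ and even entries being 1. Since $V^2 = I$, we have $U = U_2 V^2 U_1 = U'_2 U'_1$ for $U'_2 = U_2 V$ and
$U'_1 = V U_1$. Let

\[ E_\epsilon = \begin{pmatrix} 1 - 2\epsilon & 2\sqrt{\epsilon - \epsilon^2} \\ -2\sqrt{\epsilon - \epsilon^2} & 1 - 2\epsilon \end{pmatrix}. \]

Then, $U'_1$ and $U'_2$ are equal to $U_1$ and $U_2$, with every $D_\epsilon$ or $D'_\epsilon$ replaced by the corresponding $E_\epsilon$. We
will first diagonalize $U'_1$ and $U'_2$ separately and then argue that the eigenvalues of $U'_2 U'_1$ are almost the same as
the eigenvalues of $U'_2$.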

Since $U'_2$ is block diagonal, it suffices to diagonalize each block. The $1 \times 1$ identity block has eigenvalue 1. For a matrix $E_\epsilon$, its characteristic polynomial is $\lambda^2 - (2 - 4\epsilon)\lambda + 1 = 0$ and its roots are $1 - 2\epsilon \pm 2\sqrt{\epsilon - \epsilon^2}\, i$. For $\epsilon = o(1)$, this is equal to $e^{\pm(2+o(1)) i \sqrt{\epsilon}}$. Thus, the eigenvalues of $U'_2$ are 1 and $e^{\pm(2+o(1)) i \sqrt{j}/\sqrt{r+1}}$ for $j \in \{1, 2, \ldots, k\}$. Similarly, the eigenvalues of $U'_1$ are 1 and $e^{\pm(2+o(1)) i \sqrt{j}/\sqrt{N-r}}$ for $j \in \{1, 2, \ldots, k\}$.
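The asymptotic form of these eigenvalues can be checked numerically. The following is a minimal sketch, not part of the original argument: it solves the characteristic polynomial $\lambda^2 - (2-4\epsilon)\lambda + 1 = 0$ directly and reports the eigenvalue angle divided by $\sqrt{\epsilon}$, which should approach 2 as $\epsilon \to 0$. The function names are illustrative.

```python
import cmath
import math


def e_block_eigenvalues(eps: float) -> tuple[complex, complex]:
    """Roots of lambda^2 - (2 - 4*eps)*lambda + 1 = 0, the characteristic
    polynomial of the 2x2 block E_eps."""
    half_trace = 1.0 - 2.0 * eps
    # half_trace^2 - 1 = -4*(eps - eps^2) < 0, so the square root is imaginary.
    disc = cmath.sqrt(half_trace**2 - 1.0)
    return half_trace + disc, half_trace - disc


def eigen_angle(eps: float) -> float:
    """theta > 0 such that the two eigenvalues are exp(+i*theta), exp(-i*theta)."""
    first, _ = e_block_eigenvalues(eps)
    return abs(cmath.phase(first))


if __name__ == "__main__":
    for eps in (1e-2, 1e-4, 1e-6):
        # The ratio theta / sqrt(eps) tends to 2 as eps shrinks.
        print(eps, eigen_angle(eps) / math.sqrt(eps))
```

With $\epsilon = j/(r+1)$ this reproduces the angles $(2+o(1))\sqrt{j}/\sqrt{r+1}$ quoted for $U'_2$.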

To complete the proof, we use the following bound on the eigenvalues of the product of two matrices, which follows from the Hoffman-Wielandt theorem in matrix analysis [27].

Theorem 6 Let $A$ and $B$ be unitary matrices. Assume that $A$ has eigenvalues $1 + \delta_1, \ldots, 1 + \delta_m$, $B$ has eigenvalues $\mu_1, \ldots, \mu_m$ and $AB$ has eigenvalues $\mu'_1, \ldots, \mu'_m$. Then,

$$|\mu_j - \mu'_j| \le \sum_{i=1}^{m} |\delta_i|$$

for all $j \in [m]$.
Proof: In section 4.4. ∎

Let $A = U'_1$ and $B = U'_2$. Since $|e^{i\delta} - 1| \le |\delta|$, each of $|\delta_i|$ is of order $O(\frac{1}{\sqrt{N-r}})$. Therefore, their sum is of order $O(\frac{1}{\sqrt{N-r}})$ as well. Thus, for each eigenvalue of $U'_2$, there is a corresponding eigenvalue of $U'_2 U'_1$ that differs by at most $O(\frac{1}{\sqrt{N-r}})$. The lemma now follows from $\frac{1}{\sqrt{N-r}} = o(\frac{1}{\sqrt{r}})$. ∎

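4.3 Proof of Lemma 3

We assume that $|\alpha| < c\epsilon^2$ for some sufficiently small positive constant $c$. Otherwise, we can just take $t = 0$ and get $|\langle\psi_{good}|(U_2U_1)^t|\psi_{start}\rangle| = |\langle\psi_{good}|\psi_{start}\rangle| = |\alpha| \ge c\epsilon^2$.

Consider the eigenvalues of $U_2$. Since $U_2$ is described by a real $m \times m$ matrix (in the basis $|\psi_1\rangle, \ldots, |\psi_m\rangle$), its characteristic polynomial has real coefficients. Therefore, the eigenvalues are $1$, $-1$, $e^{\pm i\theta_1}$, ..., $e^{\pm i\theta_l}$. From the conditions of the lemma, we know that $\theta_j \in [\epsilon, 2\pi - \epsilon]$. For now, we assume that the eigenvalue $e^{i\pi} = -1$ never occurs; the remark at the end of this section describes the changes needed when it does.

Let $|w_{j,+}\rangle$, $|w_{j,-}\rangle$ be the eigenvectors of $U_2$ with eigenvalues $e^{i\theta_j}$, $e^{-i\theta_j}$. Let $|w_{j,+}\rangle = \sum_{j'=1}^{m} c_{j,j'}|\psi_{j'}\rangle$. Then, we can assume that $|w_{j,-}\rangle = \sum_{j'=1}^{m} c^*_{j,j'}|\psi_{j'}\rangle$. (Since $U_2$ is a real matrix, taking $U_2|w_{j,+}\rangle = e^{i\theta_j}|w_{j,+}\rangle$ and replacing every number with its complex conjugate gives $U_2|w\rangle = e^{-i\theta_j}|w\rangle$ for $|w\rangle = \sum_{j'=1}^{m} c^*_{j,j'}|\psi_{j'}\rangle$.)

We write $|\psi_{good}\rangle$ in a basis consisting of eigenvectors of $U_2$:

$$|\psi_{good}\rangle = \alpha|\psi_{start}\rangle + \sum_{j=1}^{l} \left(a_{j,+}|w_{j,+}\rangle + a_{j,-}|w_{j,-}\rangle\right). \quad (2)$$

W.l.o.g., assume that $\alpha$ is a positive real. (Otherwise, multiply $|\psi_{start}\rangle$ by an appropriate factor to make $\alpha$ a positive real.)

We can also assume that $a_{j,+} = a_{j,-} = a_j$, with $a_j$ being a positive real number. (To see that, let $|\psi_{good}\rangle = \sum_{j'=1}^{m} b_{j'}|\psi_{j'}\rangle$. Then, $b_{j'}$ are real (by the assumptions of Lemma 3). We have $\langle w_{j,+}|\psi_{good}\rangle = a_{j,+} = \sum_{j'=1}^{m} b_{j'} c^*_{j,j'}$ and $\langle w_{j,-}|\psi_{good}\rangle = a_{j,-} = \sum_{j'=1}^{m} b_{j'} (c^*_{j,j'})^* = (\sum_{j'=1}^{m} b_{j'} c^*_{j,j'})^* = a^*_{j,+}$. Multiplying $|w_{j,+}\rangle$ by $\frac{a_{j,+}}{|a_{j,+}|}$ and $|w_{j,-}\rangle$ by $\frac{a^*_{j,+}}{|a_{j,+}|}$ makes both $a_{j,+}$ and $a_{j,-}$ equal to $\sqrt{a_{j,+} a^*_{j,+}} = |a_{j,+}|$, which is a positive real.)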

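Consider the vector

$$|v_\beta\rangle = \alpha\left(1 + i\cot\frac{\beta}{2}\right)|\psi_{start}\rangle + \sum_{j=1}^{l} a_j\left(1 + i\cot\frac{-\theta_j+\beta}{2}\right)|w_{j,+}\rangle + \sum_{j=1}^{l} a_j\left(1 + i\cot\frac{\theta_j+\beta}{2}\right)|w_{j,-}\rangle. \quad (3)$$

We will prove that, for some $\beta = \Omega(\alpha)$, $|v_\beta\rangle$ and $|v_{-\beta}\rangle$ are eigenvectors of $U_2U_1$, with eigenvalues $e^{\pm i\beta}$. After that, we show that the starting state $|\psi_{start}\rangle$ is close to the state $\frac{1}{\sqrt{2}}|v_\beta\rangle + \frac{1}{\sqrt{2}}|v_{-\beta}\rangle$. Therefore, repeating $U_2U_1$ $\frac{\pi}{2\beta}$ times transforms $|\psi_{start}\rangle$ to a state close to $\frac{i}{\sqrt{2}}|v_\beta\rangle + \frac{-i}{\sqrt{2}}|v_{-\beta}\rangle$, which is equivalent to $\frac{1}{\sqrt{2}}|v_\beta\rangle - \frac{1}{\sqrt{2}}|v_{-\beta}\rangle$. We then complete the proof by showing that this state has a constant inner product with $|\psi_{good}\rangle$.

We first state some bounds on trigonometric functions that will be used throughout the proof.

Claim 2 1. $\frac{2x}{\pi} \le \sin x \le x$ for all $x \in [0, \frac{\pi}{2}]$;
2. $\frac{\pi}{4x} \le \cot x \le \frac{1}{x}$ for all $x \in [0, \frac{\pi}{4}]$.

We now start the proof by establishing a sufficient condition for $|v_\beta\rangle$ and $|v_{-\beta}\rangle$ to be eigenvectors. We have $|v_\beta\rangle = |\psi_{good}\rangle + i|v'_\beta\rangle$ where

$$|v'_\beta\rangle = \alpha\cot\frac{\beta}{2}|\psi_{start}\rangle + \sum_{j=1}^{l} a_j\cot\frac{-\theta_j+\beta}{2}|w_{j,+}\rangle + \sum_{j=1}^{l} a_j\cot\frac{\theta_j+\beta}{2}|w_{j,-}\rangle. \quad (4)$$

Claim 3 If $|v'_\beta\rangle$ is orthogonal to $|\psi_{good}\rangle$, then $|v_\beta\rangle$ is an eigenvector of $U_2U_1$ with an eigenvalue of $e^{i\beta}$ and $|v_{-\beta}\rangle$ is an eigenvector of $U_2U_1$ with an eigenvalue of $e^{-i\beta}$.

Proof: Since $|v'_\beta\rangle$ is orthogonal to $|\psi_{good}\rangle$, we have $U_1|v'_\beta\rangle = |v'_\beta\rangle$ and $U_1|v_\beta\rangle = -|\psi_{good}\rangle + i|v'_\beta\rangle$. Therefore,

$$U_2U_1|v_\beta\rangle = \alpha\left(-1 + i\cot\frac{\beta}{2}\right)|\psi_{start}\rangle + \sum_{j=1}^{l} a_j e^{i\theta_j}\left(-1 + i\cot\frac{-\theta_j+\beta}{2}\right)|w_{j,+}\rangle + \sum_{j=1}^{l} a_j e^{-i\theta_j}\left(-1 + i\cot\frac{\theta_j+\beta}{2}\right)|w_{j,-}\rangle.$$

Furthermore,

$$1 + i\cot x = \frac{\sin x + i\cos x}{\sin x} = \frac{e^{(\frac{\pi}{2} - x)i}}{\sin x}, \qquad -1 + i\cot x = \frac{-\sin x + i\cos x}{\sin x} = \frac{e^{(\frac{\pi}{2} + x)i}}{\sin x}.$$

Therefore,

$$-1 + i\cot\frac{\beta}{2} = e^{i\beta}\left(1 + i\cot\frac{\beta}{2}\right),$$

$$e^{i\theta_j}\left(-1 + i\cot\frac{-\theta_j+\beta}{2}\right) = \frac{e^{(\frac{\pi}{2} + \frac{\theta_j+\beta}{2})i}}{\sin\frac{-\theta_j+\beta}{2}} = e^{i\beta}\left(1 + i\cot\frac{-\theta_j+\beta}{2}\right)$$

and similarly for the coefficient of $|w_{j,-}\rangle$. This means that $U_2U_1|v_\beta\rangle = e^{i\beta}|v_\beta\rangle$.

For $|v_{-\beta}\rangle$, we write out the inner products $\langle\psi_{good}|v'_\beta\rangle$ and $\langle\psi_{good}|v'_{-\beta}\rangle$. Then, we see that $\langle\psi_{good}|v'_{-\beta}\rangle = -\langle\psi_{good}|v'_\beta\rangle$. Therefore, if $|\psi_{good}\rangle$ and $|v'_\beta\rangle$ are orthogonal, so are $|\psi_{good}\rangle$ and $|v'_{-\beta}\rangle$. By the argument above, this implies that $|v_{-\beta}\rangle$ is an eigenvector of $U_2U_1$ with an eigenvalue $e^{-i\beta}$. ∎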
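The phase identity at the heart of this proof, $-1 + i\cot x = e^{2ix}(1 + i\cot x)$, is easy to confirm numerically. The sketch below is an illustration only (the function names are not from the paper); it checks the identity itself and the resulting relation for the coefficient of $|w_{j,+}\rangle$, where $x = (\beta - \theta_j)/2$.

```python
import cmath
import math


def cot(x: float) -> float:
    return math.cos(x) / math.sin(x)


def phase_identity_gap(x: float) -> float:
    """|(-1 + i cot x) - e^{2ix} (1 + i cot x)|, which should be ~0."""
    lhs = complex(-1.0, cot(x))
    rhs = cmath.exp(2j * x) * complex(1.0, cot(x))
    return abs(lhs - rhs)


def eigenvalue_gap(theta: float, beta: float) -> float:
    """Compares e^{i theta} (-1 + i cot x) with e^{i beta} (1 + i cot x)
    for x = (beta - theta) / 2; passing -theta covers the w_{j,-} case."""
    x = (beta - theta) / 2.0
    lhs = cmath.exp(1j * theta) * complex(-1.0, cot(x))
    rhs = cmath.exp(1j * beta) * complex(1.0, cot(x))
    return abs(lhs - rhs)


if __name__ == "__main__":
    print(phase_identity_gap(0.37))
    print(eigenvalue_gap(1.1, 0.05), eigenvalue_gap(-1.1, 0.05))
```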

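Next, we use this sufficient condition to bound $\beta$ for which $|v_\beta\rangle$ and $|v_{-\beta}\rangle$ are eigenvectors.

Claim 4 There exists $\beta$ such that $|v'_\beta\rangle$ is orthogonal to $|\psi_{good}\rangle$ and $\frac{\epsilon\alpha}{\sqrt{2\pi}} \le \beta \le 2.6\alpha$.
Proof: Let $f(\beta) = \langle\psi_{good}|v'_\beta\rangle$. We have

$$f(\beta) = \alpha^2\cot\frac{\beta}{2} + \sum_{j=1}^{l} |a_j|^2\left(\cot\frac{-\theta_j+\beta}{2} + \cot\frac{\theta_j+\beta}{2}\right).$$

We bound $f(\beta)$ from below and above, for $\beta \in [0, \frac{\epsilon}{2}]$. For the first term, we have $\frac{\pi}{2\beta} \le \cot\frac{\beta}{2} \le \frac{2}{\beta}$ (by Claim 2). For the second term, we have

$$\cot\frac{-\theta_j+\beta}{2} + \cot\frac{\theta_j+\beta}{2} = -\frac{\sin\beta}{\sin\frac{\theta_j+\beta}{2}\,\sin\frac{\theta_j-\beta}{2}}. \quad (5)$$

For the numerator, we have $\frac{2\beta}{\pi} \le \sin\beta \le \beta$, because of Claim 2. The denominator can be bounded from below as follows:

$$\sin\frac{\theta_j+\beta}{2}\,\sin\frac{\theta_j-\beta}{2} \ge \sin\frac{\epsilon}{2}\,\sin\frac{\epsilon}{4} \ge \frac{\epsilon^2}{2\pi^2},$$

with the first inequality following from $\theta_j \ge \epsilon$ and $\beta \le \frac{\epsilon}{2}$ and the last inequality following from Claim 2. This means

$$\alpha^2\frac{\pi}{2\beta} - \frac{\pi^2\beta}{\epsilon^2}(1 - \alpha^2) \le f(\beta) \le \frac{2\alpha^2}{\beta} - \frac{\beta}{\pi}(1 - \alpha^2),$$

where we have used $\|\psi_{good}\|^2 = \alpha^2 + 2\sum_{j=1}^{l} |a_j|^2$ (by equation (2)) and $\|\psi_{good}\| = 1$ to replace $\sum_{j=1}^{l} |a_j|^2$ by $\frac{1-\alpha^2}{2}$.

The lower bound on $f(\beta)$ implies that $f(\beta) \ge 0$ for $\beta = \frac{\epsilon}{\sqrt{2\pi(1-\alpha^2)}}\alpha$. The upper bound implies that $f(\beta) \le 0$ for $\beta = \frac{\sqrt{2\pi}}{\sqrt{1-\alpha^2}}\alpha$. Since $f$ is continuous, it must be the case that $f(\beta) = 0$ for some $\beta \in \left[\frac{\epsilon}{\sqrt{2\pi(1-\alpha^2)}}\alpha, \frac{\sqrt{2\pi}}{\sqrt{1-\alpha^2}}\alpha\right]$. The claim now follows from $0 < \alpha < 0.1$. ∎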

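Let $|u_1\rangle = \frac{|v_\beta\rangle}{\|v_\beta\|}$ and $|u_2\rangle = \frac{|v_{-\beta}\rangle}{\|v_{-\beta}\|}$. We show that $|\psi_{start}\rangle$ is almost a linear combination of $|u_1\rangle$ and $|u_2\rangle$. Define $|\psi_{end}\rangle = \frac{|v_{end}\rangle}{\|v_{end}\|}$ where

$$|v_{end}\rangle = \sum_{j=1}^{l} a_j\left(1 + i\cot\frac{-\theta_j}{2}\right)|w_{j,+}\rangle + \sum_{j=1}^{l} a_j\left(1 + i\cot\frac{\theta_j}{2}\right)|w_{j,-}\rangle.$$

Claim 5

$$|u_1\rangle = c_{start}\, i|\psi_{start}\rangle + c_{end}|\psi_{end}\rangle + |u'_1\rangle,$$
$$|u_2\rangle = -c_{start}\, i|\psi_{start}\rangle + c_{end}|\psi_{end}\rangle + |u'_2\rangle$$

where $c_{start}$, $c_{end}$ are positive real numbers and $u'_1$, $u'_2$ satisfy $\|u'_1\| \le \frac{3\beta}{\epsilon}$ and $\|u'_2\| \le \frac{3\beta}{\epsilon}$, for $\beta$ from Claim 4.

Proof: By regrouping terms in equation (3), we have

$$|v_\beta\rangle = \alpha\, i\cot\frac{\beta}{2}|\psi_{start}\rangle + |v_{end}\rangle + |v''_\beta\rangle \quad (8)$$

where

$$|v''_\beta\rangle = \alpha|\psi_{start}\rangle + \sum_{j=1}^{l} a_j\, i\left(\cot\frac{-\theta_j+\beta}{2} + \cot\frac{\theta_j}{2}\right)|w_{j,+}\rangle + \sum_{j=1}^{l} a_j\, i\left(\cot\frac{\theta_j+\beta}{2} - \cot\frac{\theta_j}{2}\right)|w_{j,-}\rangle.$$

We claim that $\|v''_\beta\| \le \frac{3\beta}{\epsilon}\|v_\beta\|$. We prove this by showing that the absolute value of each of the coefficients in $|v''_\beta\rangle$ is at most $\frac{3\beta}{\epsilon}$ times the absolute value of the corresponding coefficient in $|v_\beta\rangle$. The coefficient of $|\psi_{start}\rangle$ is $\alpha$ in $|v''_\beta\rangle$ and $\alpha(1 + i\cot\frac{\beta}{2})$ in $|v_\beta\rangle$. We have

$$\left|\alpha\left(1 + i\cot\frac{\beta}{2}\right)\right| \ge \alpha\cot\frac{\beta}{2} \ge \alpha\frac{\pi}{2\beta},$$

which means that the absolute value of the coefficient of $|\psi_{start}\rangle$ in $|v''_\beta\rangle$ is at most $\frac{3\beta}{\epsilon}$ times the absolute value of the coefficient in $|v_\beta\rangle$. For the coefficient of $|w_{j,+}\rangle$, we have

$$\cot\frac{-\theta_j+\beta}{2} + \cot\frac{\theta_j}{2} = \frac{\sin\frac{\beta}{2}}{\sin\frac{-\theta_j+\beta}{2}\,\sin\frac{\theta_j}{2}}.$$

If $\theta_j - \beta \ge \frac{\pi}{2}$, then

$$\left|\frac{\sin\frac{\beta}{2}}{\sin\frac{-\theta_j+\beta}{2}\,\sin\frac{\theta_j}{2}}\right| \le \frac{\frac{\beta}{2}}{\sin\frac{\pi}{4}\,\sin\frac{\pi}{4}} = \frac{\frac{\beta}{2}}{\frac{1}{\sqrt{2}}\cdot\frac{1}{\sqrt{2}}} = \beta \le \beta\left|1 + i\cot\frac{-\theta_j+\beta}{2}\right|.$$

If $\theta_j - \beta < \frac{\pi}{2}$, then

$$\left|\frac{\sin\frac{\beta}{2}}{\sin\frac{-\theta_j+\beta}{2}\,\sin\frac{\theta_j}{2}}\right| = \left|\frac{\sin\frac{\beta}{2}}{\cos\frac{-\theta_j+\beta}{2}\,\sin\frac{\theta_j}{2}}\right| \cdot \left|\cot\frac{-\theta_j+\beta}{2}\right| \le \frac{\pi}{\sqrt{2}}\cdot\frac{\beta}{\epsilon}\left|\cot\frac{-\theta_j+\beta}{2}\right| \le \frac{3\beta}{\epsilon}\left|\cot\frac{-\theta_j+\beta}{2}\right|,$$

with the first inequality following from $|\cos\frac{-\theta_j+\beta}{2}| \ge |\cos\frac{\pi}{4}| = \frac{1}{\sqrt{2}}$ and $|\sin x| = \sin|x| \ge \frac{2|x|}{\pi}$ (using Claim 2). Therefore, the absolute value of the coefficient of $|w_{j,+}\rangle$ in $|v''_\beta\rangle$ is at most $\frac{3\beta}{\epsilon}$ times the absolute value of the coefficient of $|w_{j,+}\rangle$ in $|v_\beta\rangle$ (which is $|a_j(1 + i\cot\frac{-\theta_j+\beta}{2})|$). Similarly, we can bound the absolute value of the coefficient of $|w_{j,-}\rangle$.

By dividing equation (8) by $\|v_\beta\|$, we get

$$|u_1\rangle = c_{start}\, i|\psi_{start}\rangle + c_{end}|\psi_{end}\rangle + |u'_1\rangle$$

for $c_{start} = \frac{\alpha\cot\frac{\beta}{2}}{\|v_\beta\|}$, $c_{end} = \frac{\|v_{end}\|}{\|v_\beta\|}$ and $|u'_1\rangle = \frac{1}{\|v_\beta\|}|v''_\beta\rangle$. Since $\|v''_\beta\| \le \frac{3\beta}{\epsilon}\|v_\beta\|$, we have $\|u'_1\| \le \frac{3\beta}{\epsilon}$. The proof for $u_2$ is similar. ∎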

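Since $|u_1\rangle$ and $|u_2\rangle$ are eigenvectors of $U_2U_1$ with different eigenvalues, they must be orthogonal. Therefore,

$$\langle u_1|u_2\rangle = -c_{start}^2 + c_{end}^2 + O\left(\frac{\beta}{\epsilon}\right) = 0,$$

where $O(\frac{\beta}{\epsilon})$ denotes a term that is at most $const \cdot \frac{\beta}{\epsilon}$ in absolute value for some constant $const$ that does not depend on $\beta$ and $\epsilon$. Also,

$$\|u_1\|^2 = c_{start}^2 + c_{end}^2 + O\left(\frac{\beta}{\epsilon}\right) = 1.$$

These two equalities together with $c_{start}$ and $c_{end}$ being positive reals imply that $c_{start} = \frac{1}{\sqrt{2}} + O(\beta/\epsilon)$ and $c_{end} = \frac{1}{\sqrt{2}} + O(\beta/\epsilon)$. Therefore,

$$|u_1\rangle = \frac{1}{\sqrt{2}}\, i|\psi_{start}\rangle + \frac{1}{\sqrt{2}}|\psi_{end}\rangle + |u''_1\rangle,$$
$$|u_2\rangle = -\frac{1}{\sqrt{2}}\, i|\psi_{start}\rangle + \frac{1}{\sqrt{2}}|\psi_{end}\rangle + |u''_2\rangle,$$

with $\|u''_1\| = O(\beta/\epsilon)$ and $\|u''_2\| = O(\beta/\epsilon)$. This means that

$$|\psi_{start}\rangle = -\frac{i}{\sqrt{2}}|u_1\rangle + \frac{i}{\sqrt{2}}|u_2\rangle + |w'\rangle, \qquad |\psi_{end}\rangle = \frac{1}{\sqrt{2}}|u_1\rangle + \frac{1}{\sqrt{2}}|u_2\rangle + |w''\rangle$$

where $w'$ and $w''$ are states with $\|w'\| = O(\beta/\epsilon)$ and $\|w''\| = O(\beta/\epsilon)$. Let $t = \lfloor\frac{\pi}{2\beta}\rfloor$. Then, $(U_2U_1)^t|u_1\rangle$ is almost $i|u_1\rangle$ (plus a term of order $O(\beta)$) and $(U_2U_1)^t|u_2\rangle$ is almost $-i|u_2\rangle$. Therefore,

$$(U_2U_1)^t|\psi_{start}\rangle = |\psi_{end}\rangle + |v'\rangle$$

where $\|v'\| = O(\beta/\epsilon)$. This means that

$$|\langle\psi_{good}|(U_2U_1)^t|\psi_{start}\rangle| \ge |\langle\psi_{good}|\psi_{end}\rangle| - O\left(\frac{\beta}{\epsilon}\right). \quad (9)$$

Since $\beta \le 2.6\alpha$ and $\alpha < c\epsilon^2$, we have $O(\beta/\epsilon) = O(\epsilon)$. By choosing $c$ to be sufficiently small, we can make the $O(\beta/\epsilon)$ term less than $0.1\epsilon$. Then, Lemma 3 follows from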

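Claim 6

$$|\langle\psi_{good}|\psi_{end}\rangle| \ge \min\left(\frac{1-\alpha^2}{2}, \frac{1-\alpha^2}{4}\epsilon\right).$$

Proof: Since $|\psi_{end}\rangle = \frac{|v_{end}\rangle}{\|v_{end}\|}$, we have $\langle\psi_{good}|\psi_{end}\rangle = \frac{\langle\psi_{good}|v_{end}\rangle}{\|v_{end}\|}$. By the definition of $|v_{end}\rangle$, $\langle\psi_{good}|v_{end}\rangle = 2\sum_{j=1}^{l} a_j^2$. By equation (2), $\|\psi_{good}\|^2 = \alpha^2 + 2\sum_{j=1}^{l} a_j^2$. Since $\|\psi_{good}\|^2 = 1$, we have $\langle\psi_{good}|v_{end}\rangle = 1 - \alpha^2$. Therefore, $\langle\psi_{good}|\psi_{end}\rangle \ge \frac{1-\alpha^2}{\|v_{end}\|}$.

We have $\|v_{end}\|^2 = 2\sum_{j=1}^{l} a_j^2(1 + \cot^2\frac{\theta_j}{2})$. Since $\theta_j \in [\epsilon, 2\pi - \epsilon]$, $\|v_{end}\|^2 \le 2\sum_{j=1}^{l} a_j^2(1 + \cot^2\frac{\epsilon}{2}) \le (1 + \cot^2\frac{\epsilon}{2})$ and

$$\langle\psi_{good}|\psi_{end}\rangle \ge \frac{1-\alpha^2}{\sqrt{1 + \cot^2(\epsilon/2)}} \ge \frac{1-\alpha^2}{2\max(1, \cot\frac{\epsilon}{2})} \ge \min\left(\frac{1-\alpha^2}{2}, \frac{1-\alpha^2}{4}\epsilon\right).$$

∎

If $\alpha$ is set to be sufficiently small, $|\langle\psi_{good}|\psi_{end}\rangle|$ is of order $\Omega(\epsilon)$ and, together with equation (9), this means that $|\langle\psi_{good}|(U_2U_1)^t|\psi_{start}\rangle|$ is of order $\Omega(\epsilon)$. ∎

Remark. If $U_2$ has eigenvectors with eigenvalue $-1$, the equation (2) becomes

$$|\psi_{good}\rangle = \alpha|\psi_{start}\rangle + \sum_{j=1}^{l} \left(a_{j,+}|w_{j,+}\rangle + a_{j,-}|w_{j,-}\rangle\right) + a_{l+1}|w_{l+1}\rangle,$$

with $|w_{l+1}\rangle$ being an eigenvector with eigenvalue $-1$. We also add $a_{l+1}(1 - i\tan\frac{\beta}{2})|w_{l+1}\rangle$, $-a_{l+1}\tan\frac{\beta}{2}|w_{l+1}\rangle$ and $a_{l+1}|w_{l+1}\rangle$ terms to the right hand sides of equations (3) and (4) and of the definition of $|v_{end}\rangle$, respectively. Claims 3, 4, 5 and 6 remain true, but the proofs of the claims require some modifications to handle the $|w_{l+1}\rangle$ term.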

4.4 Derivation of Theorem 6

In this section, we derive Theorem 6 (which was used in the proof of Lemma 2) from the Hoffman-Wielandt inequality.

Definition 3 For a matrix $C = (c_{ij})$, we define its $l_2$-norm as $\|C\| = \sqrt{\sum_{i,j} |c_{ij}|^2}$.

Theorem 7 [27, pp. 292] If $U$ is unitary, then $\|UC\| = \|C\|$ for any $C$.

Theorem 8 [27, Theorem 6.3.5] Let $C$ and $D$ be $m \times m$ matrices. Let $\mu_1, \ldots, \mu_m$ and $\mu'_1, \ldots, \mu'_m$ be eigenvalues of $C$ and $D$, respectively. Then,

$$\sum_{i=1}^{m} |\mu_i - \mu'_i|^2 \le \|C - D\|^2.$$

To derive Theorem 6 from Theorem 8, let $C = B$ and $D = AB$. Then, $C - D = (I - A)B$. Since $B$ is unitary, $\|C - D\| = \|I - A\|$ (Theorem 7). Let $U$ be a unitary matrix that diagonalizes $A$. Then, $U(I - A)U^{-1} = I - UAU^{-1}$ and $\|I - A\| = \|I - UAU^{-1}\|$. Since $UAU^{-1}$ is a diagonal matrix with $1 + \delta_i$ on the diagonal, $I - UAU^{-1}$ is a diagonal matrix with $\delta_i$ on the diagonal (up to sign) and $\|I - UAU^{-1}\|^2 = \sum_{i=1}^{m} |\delta_i|^2$. By applying Theorem 8 to $C = B$ and $D = AB$, we get

$$\sum_{i=1}^{m} |\mu_i - \mu'_i|^2 \le \sum_{i=1}^{m} |\delta_i|^2.$$

In particular, for every $i$, we have $|\mu_i - \mu'_i|^2 \le \sum_{i=1}^{m} |\delta_i|^2$ and

$$|\mu_i - \mu'_i| \le \sqrt{\sum_{i=1}^{m} |\delta_i|^2} \le \sum_{i=1}^{m} |\delta_i|.$$
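As a sanity check on this derivation, the bound can be tested numerically on small unitary matrices. The sketch below is illustrative and makes two assumptions that are not in the text: it restricts to $2 \times 2$ matrices so that eigenvalues follow from the quadratic formula, and it pairs the eigenvalues of $B$ and $AB$ by trying both orderings (the Hoffman-Wielandt inequality guarantees the bound for a suitable ordering).

```python
import cmath
import itertools
import math

Matrix = list[list[complex]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a: Matrix) -> Matrix:
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def rotation(angle: float) -> Matrix:
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]


def eigenvalues_2x2(a: Matrix) -> list[complex]:
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    root = cmath.sqrt(trace * trace - 4 * det)
    return [(trace + root) / 2, (trace - root) / 2]


def check(delta_angles, mix_angle, b_angle, b_phase):
    """Return (best eigenvalue distance, sqrt(sum |delta_i|^2))."""
    # A = W diag(e^{i a_1}, e^{i a_2}) W^dagger has eigenvalues 1 + delta_i.
    w = rotation(mix_angle)
    diag = [[cmath.exp(1j * delta_angles[0]), 0],
            [0, cmath.exp(1j * delta_angles[1])]]
    a = matmul(matmul(w, diag), dagger(w))
    # B = rotation times a diagonal phase matrix, hence unitary.
    b = matmul(rotation(b_angle), [[cmath.exp(1j * b_phase), 0], [0, 1]])
    deltas = [cmath.exp(1j * t) - 1 for t in delta_angles]
    bound = math.sqrt(sum(abs(d) ** 2 for d in deltas))
    mu, mu_prime = eigenvalues_2x2(b), eigenvalues_2x2(matmul(a, b))
    best = min(
        math.sqrt(sum(abs(x - y) ** 2 for x, y in zip(mu, perm)))
        for perm in itertools.permutations(mu_prime)
    )
    return best, bound


if __name__ == "__main__":
    print(check((0.01, -0.02), 0.7, 1.3, 0.4))
```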

5 Analysis of multiple k-collision algorithm

To solve the general case of k-distinctness, we run Algorithm 2 several times, on subsets of the input.

The simplest approach is as follows. We first run Algorithm 2 on the entire input $x_i$, $i \in [N]$. We then choose a sequence of subsets $T_1 \subseteq [N]$, $T_2 \subseteq [N]$, ... with $T_i$ being a random subset of size $|T_i| = (\frac{2k}{2k+1})^i N$, and run Algorithm 2 on $x_i$, $i \in T_1$, then on $x_i$, $i \in T_2$ and so on. It can be shown that, if the input $x_i$, $i \in [N]$ contains a k-collision, then with probability at least 1/2, there exists $j$ such that $x_i$, $i \in T_j$ contains exactly one k-collision. This means that running Algorithm 2 on $x_i$, $i \in T_j$ finds the k-collision with a constant probability.

1. Let $T_1 = [N]$. Let $j = 1$.

2. While $|T_j| > \max(r, \sqrt{N})$ repeat:

(a) Run Algorithm 2 on $x_i$, $i \in T_j$, using memory size $r_j = \frac{r|T_j|}{N}$. Measure the final state, obtaining a set $S$. If there are $k$ equal elements $x_i$, $i \in S$, stop, answer "there is a k-collision".

(b) Let $q_j$ be an even power of a prime with $|T_j| \le q_j \le (1 + \frac{1}{2k^2})|T_j|$. Select a random permutation $\pi_j$ on $[q_j]$ from a $\frac{1}{N}$-approximately $2k\log N$-wise independent family of permutations (Theorem 2).

(c) Let $T_{j+1} = \left\{\pi_1^{-1}\pi_2^{-1}\ldots\pi_j^{-1}(i),\ i \in \left[\left\lceil\frac{2k}{2k+1}q_j\right\rceil\right]\right\}$.

(d) Let $j = j + 1$.

3. If $|T_j| \le r$, query all $x_i$, $i \in T_j$ classically. If $k$ equal elements are found, answer "there is a k-collision", otherwise, answer "there is no k-collision".

4. If $|T_j| \le \sqrt{N}$, run Grover search on the set of at most $N^{k/2}$ k-tuples $(i_1, \ldots, i_k)$ of pairwise distinct $i_1, \ldots, i_k \in T_j$, searching for a tuple $(i_1, \ldots, i_k)$ such that $x_{i_1} = \ldots = x_{i_k}$. If such a tuple is found, answer "there is a k-collision", otherwise, answer "there is no k-collision".

Algorithm 3: Multiple-solution algorithm
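To get a feel for how many rounds the shrinking subsets produce, the sizes $|T_j|$ can be tabulated classically. The sketch below is a simplification under stated assumptions: it ignores the rounding of $|T_j|$ up to an even prime power $q_j$ and simply multiplies the size by $\frac{2k}{2k+1}$ (rounded down) each round; `subset_sizes` is an illustrative name, not part of the paper.

```python
import math


def subset_sizes(n: int, k: int, r: int) -> list[int]:
    """Sizes |T_1| = n, |T_2|, ... shrinking by a factor 2k/(2k+1) per round,
    until the size drops to max(r, sqrt(n)), where the while loop stops."""
    threshold = max(r, math.isqrt(n))
    sizes = [n]
    while sizes[-1] > threshold:
        # Integer floor keeps the sequence strictly decreasing.
        sizes.append((sizes[-1] * 2 * k) // (2 * k + 1))
    return sizes


if __name__ == "__main__":
    sizes = subset_sizes(n=10**6, k=2, r=10**4)
    print(len(sizes), sizes[:3], sizes[-1])
```

The number of rounds grows like $k \log N$, which is the count used later when the error terms are summed over the iterations of the while loop.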

The difficulty with this solution is choosing subsets $T_j$. If we choose a subset of size $(\frac{2k}{2k+1})^i N$ uniformly at random, we need $\Omega(N)$ space to store the subset and $\Omega(N)$ time to generate it. Thus, the straightforward implementation of this solution is efficient in terms of query complexity but not in terms of time or space. Algorithm 3 is a more complicated implementation of the same approach that also achieves time-efficiency and space-efficiency.

We claim

Theorem 9 (a) Algorithm 3 uses $O(r + \frac{N^{k/2}}{r^{(k-1)/2}})$ queries.

(b) Let $p$ be the success probability of Algorithm 2 if there is exactly one k-collision. For any $x_1, \ldots, x_N$ containing at least one k-collision, Algorithm 3 finds a k-collision with probability at least $(1 - o(1))p/2$.

Proof:

Part (a). The second to last step of Algorithm 3 uses at most $r$ queries. The last step uses $O(N^{k/4})$ queries and is performed only if $\sqrt{N} \ge r$. In this case, $\frac{N^{k/2}}{r^{(k-1)/2}} \ge N^{(k+1)/4} \ge N^{k/4}$. Thus, the last two steps use $O(r + \frac{N^{k/2}}{r^{(k-1)/2}})$ queries and it suffices to show that Algorithm 3 uses $O(r + \frac{N^{k/2}}{r^{(k-1)/2}})$ queries in its second step (the while loop).

Let $T_j$ and $r_j$ be as in Algorithm 3. Then $|T_1| = N$ and $|T_{j+1}| \le \frac{2k}{2k+1}(1 + \frac{1}{2k^2})|T_j|$. The number of queries in the $j$-th iteration of the while loop is of the order

$$\frac{|T_j|^{k/2}}{r_j^{(k-1)/2}} = \frac{|T_j|^{k/2}}{(|T_j| r/N)^{(k-1)/2}} = \frac{N^{(k-1)/2}}{r^{(k-1)/2}}\sqrt{|T_j|} = \sqrt{\frac{|T_j|}{N}}\,\frac{N^{k/2}}{r^{(k-1)/2}}.$$

The total number of queries in the while loop is of the order

$$\sum_{j} \sqrt{\frac{|T_j|}{N}}\,\frac{N^{k/2}}{r^{(k-1)/2}} = O\left(\frac{N^{k/2}}{r^{(k-1)/2}}\right), \quad (10)$$

because the sizes $|T_j|$ decrease geometrically.

Part (b). If $x_1, \ldots, x_N$ contain exactly one k-collision, then running Algorithm 2 on all of $x_1, \ldots, x_N$ finds the k-collision with probability at least $p$. If $x_1, \ldots, x_N$ contain more than one k-collision, we can have three cases:

1. For some $j$, $T_j$ contains more than one k-collision but $T_{j+1}$ contains exactly one k-collision.

2. For some $j$, $T_j$ contains more than one k-collision but $T_{j+1}$ contains no k-collisions.

3. All $T_j$ contain more than one k-collision (till $|T_j|$ becomes smaller than $\max(r, \sqrt{N})$ and the loop is stopped).

In the first case, performing Algorithm 2 on $x_i$, $i \in T_{j+1}$ finds the k-collision with probability at least $p$. In the second case, we have no guarantees about the probability at all. In the third case, the last step of Algorithm 3 finds one of the k-collisions with probability 1.

We will show that the probability of the second case is always less than the probability of the first case plus an asymptotically small quantity. This implies that, with probability at least $1/2 - o(1)$, either the first or the third case occurs. Therefore, the probability of Algorithm 3 finding a k-collision is at least $(1/2 - o(1))p$. To complete the proof, we show

Lemma 4 Let $T$ be a set containing a k-collision. Let $None_j$ be the event that $x_i$, $i \in T_j$ contains no k-collision and $Unique_j$ be the event that $x_i$, $i \in T_j$ contains a unique k-collision. Then,

$$Pr[Unique_{j+1} | T_j = T] \ge Pr[None_{j+1} | T_j = T] - o\left(\frac{1}{N^{1/4}}\right) \quad (11)$$

where $Pr[Unique_{j+1} | T_j = T]$ and $Pr[None_{j+1} | T_j = T]$ denote the conditional probabilities of $Unique_{j+1}$ and $None_{j+1}$, if $T_j = T$.

The probability of the first case is just the sum of probabilities

$$Pr[Unique_{j+1} \wedge T_j = T] = Pr[T_j = T]\,Pr[Unique_{j+1} | T_j = T]$$

over all $j$ and $T$ such that $|T| > \max(r, \sqrt{N})$ and $T$ contains more than one k-collision. The probability of the second case is a similar sum of probabilities

$$Pr[None_{j+1} \wedge T_j = T] = Pr[T_j = T]\,Pr[None_{j+1} | T_j = T].$$

Therefore, $Pr[Unique_{j+1} | T_j = T] \ge Pr[None_{j+1} | T_j = T] - o(\frac{1}{N^{1/4}})$ implies that the probability of the second case is less than the probability of the first case plus a term of order $o(\frac{1}{N^{1/4}})$ times the number of repetitions for the while loop. The number of repetitions is $O(k \log N)$, because $|T_{j+1}| \le \frac{2k}{2k+1}(1 + \frac{1}{2k^2})|T_j| = (1 - \frac{k-1}{k(2k+1)})|T_j|$. Therefore, the probability of the second case is less than the probability of the first case plus a term of order $o(\frac{k\log N}{N^{1/4}}) = o(1)$.

It remains to prove the lemma.
Proof: [of Lemma 4] We fix the permutations $\pi_1, \ldots, \pi_{j-1}$ and let $\pi_j$ be chosen uniformly at random from the family of permutations given by Theorem 2.

We consider two cases. The first case is when $T_j$ contains many k-collisions. We show that, in this case, the lemma is true because the probability of $None_{j+1}$ is small (of order $o(\frac{1}{N^{1/4}})$). The second case is if $T_j$ contains few k-collisions. In this case, we pick one $x$ such that there are at least $k$ elements $i$ with $x_i = x$. We compare the probabilities that

• $T_{j+1}$ contains no k-collisions;

• $T_{j+1}$ contains exactly one k-collision, consisting of $i$ with $x_i = x$.

The first event is the same as $None_{j+1}$, the second event implies $Unique_{j+1}$. We prove the lemma by showing that the probability of the second event is at least the probability of the first event minus a small amount. This is proven by first conditioning on $T_{j+1}$ containing no k-collisions consisting of $i$ with $x_i \neq x$ and then comparing the probability that less than $k$ of $i : x_i = x$ belong to $T_{j+1}$ with the probability that exactly $k$ of $i : x_i = x$ belong to $T_{j+1}$.

Case 1. $T_j$ contains at least $\log N$ pairwise disjoint sets $S_l = \{i_{l,1}, \ldots, i_{l,k}\}$ with $x_{i_{l,1}} = \ldots = x_{i_{l,k}}$.

Let $S = S_1 \cup S_2 \cup \ldots \cup S_{\log N}$. If event $None_{j+1}$ occurs, at least $\log N$ of $\pi_j\pi_{j-1}\ldots\pi_1(i)$, $i \in S$ (at least one from each of the sets $S_1, \ldots, S_{\log N}$) must belong to $\{\lceil\frac{2k}{2k+1}q_j\rceil + 1, \ldots, q_j\}$. By the next claim, this probability is almost the same as the probability that at least $\log N$ of $k\log N$ random elements of $[q_j]$ belong to $\{\lceil\frac{2k}{2k+1}q_j\rceil + 1, \ldots, q_j\}$.

Claim 7 Let $S \subseteq T_j$, $|S| \le 2k\log N$. Let $V \subseteq [q_j]^{|S|}$. Let $p$ be the probability that $(\pi_j\pi_{j-1}\ldots\pi_1(i))_{i \in S}$ belongs to $V$ and let $p'$ be the probability that a tuple consisting of $|S|$ uniformly random elements of $[q_j]$ belongs to $V$. Then,

$$|p - p'| \le \frac{|S|^2 + 1}{q_j}.$$

Proof: Let $S' = \{\pi_{j-1}\ldots\pi_1(i) \mid i \in S\}$. Then, $p$ is the probability that $(\pi_j(i))_{i \in S'}$ belongs to $V$. Let $p''$ be the probability that $(v_1, \ldots, v_{|S|})$ belongs to $V$, for $(v_1, \ldots, v_{|S|})$ picked uniformly at random among all tuples of $|S|$ distinct elements of $[q_j]$. By the definition of an approximately independent family of permutations, $|p - p''| \le \frac{1}{N}$.

It remains to bound $|p'' - p'|$. If $(v_1, \ldots, v_{|S|})$ is picked uniformly at random among tuples of distinct elements, every tuple of $|S|$ distinct elements has a probability $\frac{1}{q_j(q_j-1)\ldots(q_j-|S|+1)}$ and the tuples of non-distinct elements have probability 0. If $(v_1, \ldots, v_{|S|})$ is uniformly at random among all tuples, every tuple has probability $\frac{1}{q_j^{|S|}}$. Therefore,

$$p'' - p' \le q_j(q_j-1)\ldots(q_j-|S|+1)\left(\frac{1}{q_j(q_j-1)\ldots(q_j-|S|+1)} - \frac{1}{q_j^{|S|}}\right) = 1 - \frac{q_j(q_j-1)\ldots(q_j-|S|+1)}{q_j^{|S|}},$$

which implies

$$|p'' - p'| \le 1 - \frac{q_j(q_j-1)\ldots(q_j-|S|+1)}{q_j^{|S|}}.$$

We have

$$\frac{q_j(q_j-1)\ldots(q_j-|S|+1)}{q_j^{|S|}} \ge \left(\frac{q_j - |S|}{q_j}\right)^{|S|} = \left(1 - \frac{|S|}{q_j}\right)^{|S|} \ge 1 - \frac{|S|^2}{q_j}.$$

∎

The probability that, out of A;log uniformly random ii, . . . ,ikiogN G {1; • • • j'Zj}' l^^st log 
belong to { \ 2k+i ^3~\ + • • • i Qj} be bounded using Chernoff bounds l33l . Let X; be a random variable 
that is 1 if e { l^i^Qj] +!,•••, Qil- Let X = Xi + . . . + XkiogN- We need to bound Pr[X > log A^]. 
We have E[X] = klogN- E[Xi] = log N - o(l) and 

Pr[X>logiV]< (^^^^^ I =e-0-3i6-iog^ = o^ ^ 



with the first inequality following from Theorem 4.4 of (Pr[X > (1 + < ( (1+^)1+^ )^'^^ for 

X that is a sum of independent identically distributed 0-1 valued random variables). By combining this 
bound with Claim0 the probability of Nonejj^i is 

1 \ (A;logiV)2 + l / 1 
H = o 



ivV4; q. ym/"^ 

where we used qj > \Tj\ > ^/N (otherwise, the algorithm finishes the while loop). 
Case 2. $T_j$ contains less than $\log N$ pairwise disjoint sets $S_l = \{i_{l,1}, \ldots, i_{l,k}\}$ with $x_{i_{l,1}} = \ldots = x_{i_{l,k}}$.

Let $S$ be the set of all $i$ such that $x_i$ is a part of a k-collision among $x_i$, $i \in T_j$.

Claim 8 $|S| \le 2k\log N$.

Proof: We first select a maximal collection of pairwise disjoint $S_l$. This collection contains less than $k\log N$ elements. It remains to prove that $|S - \cup_l S_l| \le k\log N$.

Since the collection $\{S_l\}$ is maximal, any k-collision between $x_i$, $i \in T_j$ must involve at least one element from $\cup_l S_l$. Therefore, for any $x$, $S \setminus \cup_l S_l$ contains at most $k - 1$ values $i$ with $x_i = x$. Also, there are less than $\log N$ possible $x$ because any k-collision must involve an element from one of the sets $S_l$ and there are less than $\log N$ sets $S_l$. This means that $|S - \cup_l S_l| \le (k-1)\log N$. ∎

Let $y_1, y_2, \ldots$ be an enumeration of all distinct $y$ such that $T_j$ contains a k-collision $i_1, \ldots, i_k$ with $x_{i_1} = \ldots = x_{i_k} = y$. Let $UniqueColl_l$ be the event that $T_{j+1}$ contains exactly one k-collision $i_1, \ldots, i_k$ with $x_{i_1} = \ldots = x_{i_k} = y_l$ and $NoColl_l$ be the event that $T_{j+1}$ contains no such collision. The event $None_{j+1}$ is the same as $\bigwedge_l NoColl_l$. The event $Unique_{j+1}$ is implied by $UniqueColl_1 \wedge \bigwedge_{l>1} NoColl_l$. Therefore, it suffices to show

$$Pr\left[\bigwedge_l NoColl_l\right] \le Pr\left[UniqueColl_1 \wedge \bigwedge_{l>1} NoColl_l\right] + \frac{2((2k\log N)^2 + 1)}{q_j}. \quad (12)$$

The events $UniqueColl_l$ and $NoColl_l$ are equivalent to the cardinality of

$$\left\{ i : x_i = y_l,\ i \in T_j \text{ and } \pi_j\ldots\pi_1(i) \in \left\{1, \ldots, \left\lceil\frac{2k}{2k+1}q_j\right\rceil\right\} \right\}$$

being exactly $k$ and less than $k$, respectively.

By Claim 7, the probabilities of both $\bigwedge_l NoColl_l$ and $UniqueColl_1 \wedge \bigwedge_{l>1} NoColl_l$ change by at most $\frac{(2k\log N)^2 + 1}{q_j}$ if we replace $(\pi_j\ldots\pi_1(i))_{i \in S}$ by a tuple of $|S|$ random elements of $[q_j]$. Then, the events $NoColl_l$ and $UniqueColl_l$ are independent of the events $NoColl_{l'}$ and $UniqueColl_{l'}$ for $l' \neq l$. Therefore,

$$Pr\left[\bigwedge_l NoColl_l\right] = Pr[NoColl_1] \prod_{l>1} Pr[NoColl_l],$$

$$Pr\left[UniqueColl_1 \wedge \bigwedge_{l>1} NoColl_l\right] = Pr[UniqueColl_1] \prod_{l>1} Pr[NoColl_l].$$

This means that, to show (12) for the actual probability distribution $(\pi_j\ldots\pi_1(i))_{i \in S}$, it suffices to prove $Pr[UniqueColl_1] \ge Pr[NoColl_1]$ for tuples consisting of $|S|$ random elements.

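Let $I$ be the set of all $i \in T_j$ such that $x_i = y_1$. Let $m = |I|$. Notice that $m \ge k$ (by definition of $y_1$ and $I$). Let $P_l$ be the event that exactly $l$ of $i \in I$ belong to $T_{j+1}$. Then, $Pr[UniqueColl_1] = Pr[P_k]$ and $Pr[NoColl_1] = \sum_{l=0}^{k-1} Pr[P_l]$. When $\pi_j\ldots\pi_1(i)$, $i \in I$ are replaced by random elements of $[q_j]$, we have

$$Pr[P_l] = \binom{m}{l}\left(\frac{2k}{2k+1}\right)^l\left(\frac{1}{2k+1}\right)^{m-l},$$

$$\frac{Pr[P_l]}{Pr[P_{l+1}]} = \frac{\binom{m}{l}}{\binom{m}{l+1}} \cdot \frac{\frac{1}{2k+1}}{\frac{2k}{2k+1}} = \frac{l+1}{m-l} \cdot \frac{1}{2k}.$$

For $l \le k - 1$, we have $\frac{l+1}{m-l} \le k$. This implies $Pr[P_l] \le \frac{1}{2}Pr[P_{l+1}]$, $Pr[P_l] \le \frac{1}{2^{k-l}}Pr[P_k]$ and

$$\sum_{l=0}^{k-1} Pr[P_l] \le \left(\sum_{l=0}^{k-1} \frac{1}{2^{k-l}}\right) Pr[P_k] \le Pr[P_k],$$

which is equivalent to $Pr[NoColl_1] \le Pr[UniqueColl_1]$. ∎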
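The final inequality can be checked exactly for small parameters. The following sketch is an illustration under the same idealization used in the proof (each of the $m$ elements lands in $T_{j+1}$ independently with probability exactly $\frac{2k}{2k+1}$); it uses exact rational arithmetic, and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def landing_probabilities(m: int, k: int) -> list[Fraction]:
    """Pr[P_l] for l = 0..m when each of m elements independently lands in
    T_{j+1} with probability 2k/(2k+1)."""
    p = Fraction(2 * k, 2 * k + 1)
    return [comb(m, l) * p**l * (1 - p) ** (m - l) for l in range(m + 1)]


def no_coll_and_unique(m: int, k: int) -> tuple[Fraction, Fraction]:
    """Return (Pr[NoColl_1], Pr[UniqueColl_1]) = (sum_{l<k} Pr[P_l], Pr[P_k])."""
    probs = landing_probabilities(m, k)
    return sum(probs[:k]), probs[k]


if __name__ == "__main__":
    for k in (2, 3):
        for m in (k, k + 1, k + 5):
            print(k, m, no_coll_and_unique(m, k))
```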
6 Running time and other issues



6.1 Comparison model 

Our algorithm can be adapted to the model of comparison queries similarly to the algorithm of [14]. Instead of having the register $\otimes_{j \in S}|x_j\rangle$, we have a register $|j_1, j_2, \ldots, j_r\rangle$ where $|j_l\rangle$ is the index of the $l$-th smallest element in the set $S$. Given such a register and $y \in [N]$, we can add $y$ to $|j_1, \ldots, j_r\rangle$ by binary search, which takes $O(\log N^{k/(k+1)}) = O(\log N)$ queries. We can also remove a given $x \in [N]$ in $O(\log N)$ queries by reversing this process. This gives an algorithm with $O(N^{k/(k+1)} \log N)$ queries.
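The binary-search insertion can be sketched classically as follows. This is an illustration only: `less_than` stands in for the comparison oracle (it is the only way the routine touches the hidden values), indices are 0-based here rather than taken from $[N]$, and the counter simply records how many comparison queries were made.

```python
def insert_position(sorted_indices: list[int], y: int, less_than) -> tuple[int, int]:
    """Return (position, comparisons) for inserting index y into a list of
    indices that is kept in increasing order of the hidden values."""
    lo, hi, comparisons = 0, len(sorted_indices), 0
    while lo < hi:
        mid = (lo + hi) // 2
        comparisons += 1  # one comparison query: is x[sorted_indices[mid]] < x[y]?
        if less_than(sorted_indices[mid], y):
            lo = mid + 1
        else:
            hi = mid
    return lo, comparisons


if __name__ == "__main__":
    values = [50, 10, 40, 20, 30]           # hidden inputs x_0 .. x_4
    order = [1, 3, 4, 2]                    # indices sorted by value: 10, 20, 30, 40
    oracle = lambda a, b: values[a] < values[b]
    position, used = insert_position(order, 0, oracle)
    order.insert(position, 0)
    print(order, used)
```

For a register of $r$ indices the loop makes at most $\lceil \log_2(r+1) \rceil$ comparisons, which matches the $O(\log N)$ bound quoted above.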

6.2 Running time 

So far, we have shown that our algorithm solves element k-distinctness with $O(N^{k/(k+1)})$ queries. In this section, we consider the actual running time of our algorithm (when non-query transformations are taken into account).

Overview. All that we do between queries is Grover's diffusion operator, which can be implemented in $O(\log N)$ quantum time, and some data structure operations on the set $S$ (for example, insertions and deletions).

We now show how to store $S$ in a classical data structure which supports the necessary operations in $O(\log^4(N + M))$ time. In a sufficiently powerful quantum model, it is possible to transform these $O(\log^4(N + M))$ time classical operations into an $O(\log^c(N + M))$ step quantum computation. Then, our quantum algorithm runs in $O(N^{k/(k+1)} \log^c(N + M))$ steps. We will first show this for the standard query model and then describe how the implementation should be modified for it to work in the comparison model.

Required operations. To implement Algorithm 2 we need the following operations:

1. Adding $y$ to $S$ and storing $x_y$ (step 2 of Algorithm 1);

2. Removing $y$ from $S$ and erasing $x_y$ (step 5 of Algorithm 1);

3. Checking if $S$ contains $i_1, \ldots, i_k$ with $x_{i_1} = \ldots = x_{i_k}$ (to perform the conditional phase flip in step 3a of Algorithm 2);

4. Diffusion transforms on the $|x\rangle$ register in steps 1 and 4 of Algorithm 1.

Additional requirements. Making a data structure part of a quantum algorithm creates two subtle issues. First, there is the uniqueness problem. In many classical data structures, the same set $S$ can be stored in many equivalent ways, depending on the order in which elements were added and removed. In the quantum case, this would mean that the basis state $|S\rangle$ is replaced by many states $|S^1\rangle, |S^2\rangle, \ldots$ which in addition to $S$ store some information about the previous sets. This can have a very bad result. In the original quantum algorithm, we might have $\alpha|S\rangle$ interfering with $-\alpha|S\rangle$, resulting in 0 amplitude for $|S\rangle$. If $\alpha|S\rangle - \alpha|S\rangle$ becomes $\alpha|S^1\rangle - \alpha|S^2\rangle$, there is no interference between $|S^1\rangle$ and $|S^2\rangle$ and the result of the algorithm will be different.

To avoid this problem, we need a data structure where the same set $S \subseteq [N]$ is always stored in the same way, independent of how $S$ was created.
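The uniqueness requirement can be illustrated with a purely classical toy example. The sketch below is not the paper's data structure; it only contrasts a representation whose memory layout depends on the history of operations with one that has a single canonical layout per set. Both function names are illustrative.

```python
def history_dependent(ops: list[tuple[str, int]]) -> tuple[int, ...]:
    """Append-only list: the final layout depends on the insertion order."""
    items: list[int] = []
    for op, y in ops:
        if op == "add":
            items.append(y)
        else:
            items.remove(y)
    return tuple(items)


def history_independent(ops: list[tuple[str, int]]) -> tuple[int, ...]:
    """Sorted tuple: exactly one layout for each set S."""
    members: set[int] = set()
    for op, y in ops:
        if op == "add":
            members.add(y)
        else:
            members.discard(y)
    return tuple(sorted(members))


if __name__ == "__main__":
    first = [("add", 3), ("add", 1)]
    second = [("add", 1), ("add", 3)]
    print(history_dependent(first), history_dependent(second))      # differ
    print(history_independent(first), history_independent(second))  # agree
```

In the quantum setting the first behaviour would turn one basis state $|S\rangle$ into several distinguishable states, which is exactly what prevents the required interference.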

Second, if we use a classical subroutine, it must terminate in a fixed time $t$. Only then can we replace it by an $O(poly(t))$ time quantum algorithm. Subroutines that take time $t$ on average (but might sometimes take longer) are not acceptable.

Figure 1: A skip list with 3 levels

Model. To implement our algorithm, we use the standard quantum circuit model, augmented with gates for random access to a quantum memory. A random access gate takes three inputs: $|i\rangle$, $|b\rangle$ and $|z\rangle$, with $b$ being a single qubit, $z$ being an $m$-qubit register and $i \in [m]$. It then implements the mapping

$$|i, b, z\rangle \to |i, z_i, z_1 \ldots z_{i-1}\, b\, z_{i+1} \ldots z_m\rangle.$$

Random access gates are not commonly used in quantum algorithms but are necessary in our case because, otherwise, simple data structure operations (for example, removing $y$ from $S$) which require $O(\log N)$ time classically would require $\Omega(r)$ time quantumly.
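On computational basis states, the random access gate simply swaps the qubit $b$ with the $i$-th cell of the memory register. The sketch below shows only this classical action on basis states (a full simulation would apply it linearly to superpositions); the function name is illustrative.

```python
def random_access_gate(i: int, b: int, z: list[int]) -> tuple[int, int, list[int]]:
    """Action on the basis state |i, b, z>: swap b with z_i.
    The index i is 1-based, as in the text."""
    if not 1 <= i <= len(z):
        raise ValueError("index out of range")
    new_z = list(z)
    new_b, new_z[i - 1] = z[i - 1], b
    return i, new_b, new_z


if __name__ == "__main__":
    state = random_access_gate(2, 1, [0, 0, 1])
    print(state)                        # b and z_2 exchanged
    print(random_access_gate(*state))   # applying the gate twice undoes it
```

Because the map is a permutation of basis states (in fact an involution), it extends to a unitary transformation.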

In addition to random access gates, we allow the standard one and two qubit gates [9].

Data structure: overview. Our data structure is a combination of a hash table and a skip list. We use the hash table to store pairs $(i, x_i)$ in the memory and to access them when we need to find $x_i$ for a given $i$. We use the skip list to keep the items sorted in the order of increasing $x_i$ so that, when a new element $i$ is added to $S$, we can quickly check if $x_i$ is equal to any of $x_j$, $j \in S$.

We also maintain a variable $v$ counting the number of different $x \in [M]$ such that the set $S$ contains $i_1, \ldots, i_k$ with $x_{i_1} = \ldots = x_{i_k} = x$.

Data structure: hash table. Our hash table consists of $r$ buckets, each of which contains memory for $\lceil\log N\rceil$ entries. Each entry uses $O(\log^2 N + \log M)$ qubits. The total memory is, thus, $O(r\log^3(N + M))$, slightly more than in the case when we were only concerned about the number of queries.

We hash $\{1, \ldots, N\}$ to the $r$ buckets using a fixed hash function $h(i) = \lfloor i \cdot r/N \rfloor + 1$. The $j$-th bucket stores pairs $(i, x_i)$ for $i \in S$ such that $h(i) = j$, in the order of increasing $i$.

If there are more than $\lceil\log N\rceil$ entries with $h(i) = j$, the bucket only stores $\lceil\log N\rceil$ of them. This means that our data structure malfunctions. We will show that the probability of that happening is small.

Besides the $\lceil\log N\rceil$ entries, each bucket also contains memory for storing $\lfloor\log r\rfloor$ counters $d_1, \ldots, d_{\lfloor\log r\rfloor}$. The counter $d_1$ in the $j$-th bucket counts the number of $i \in S$ such that $h(i) = j$. The counter $d_l$, $l > 1$ is only used if $j$ is divisible by $2^l$. Then, it counts the number of $i \in S$ such that $j - 2^l + 1 \le h(i) \le j$.

The entry for $(i, x_i)$ contains $(i, x_i)$, together with memory for $\lceil\log N\rceil + 1$ pointers to other entries that are used to set up a skip list (described below).
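The bucket layout can be sketched classically as follows. This is an illustration under stated assumptions, not the paper's implementation: logarithms are taken base 2, the counters $d_l$ are recomputed on demand instead of being stored and updated, and one extra bucket is allocated because the formula $h(i) = \lfloor i \cdot r/N \rfloor + 1$ as written yields $r + 1$ for $i = N$. The class and method names are hypothetical.

```python
import math


class BucketTable:
    """Sketch of the r-bucket hash table with per-bucket range counters."""

    def __init__(self, n: int, r: int):
        self.n, self.r = n, r
        self.capacity = math.ceil(math.log2(n))  # ceil(log N) entries per bucket
        self.buckets = {j: [] for j in range(1, r + 2)}

    def h(self, i: int) -> int:
        return (i * self.r) // self.n + 1

    def add(self, i: int, x_i: int) -> bool:
        bucket = self.buckets[self.h(i)]
        if len(bucket) >= self.capacity:
            return False  # overflow: the case in which the structure fails
        bucket.append((i, x_i))
        bucket.sort()  # keep entries in the order of increasing i
        return True

    def counter(self, j: int, l: int) -> int:
        """d_1 counts the entries of bucket j; d_l (l > 1, j divisible by 2^l)
        counts the entries with j - 2^l + 1 <= h(i) <= j."""
        if l == 1:
            return len(self.buckets[j])
        if j % (2 ** l) != 0:
            raise ValueError("d_l is only used when j is divisible by 2^l")
        return sum(len(self.buckets[b]) for b in range(j - 2 ** l + 1, j + 1))


if __name__ == "__main__":
    table = BucketTable(n=16, r=4)
    for i in (1, 4, 7, 15):
        table.add(i, 100 + i)
    print(table.counter(2, 1), table.counter(4, 2))
```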

Data structure: skip list. In a skip list [35], each $i \in S$ has a randomly assigned level $l_i$ between 0 and $l_{max} = \lceil\log N\rceil$. The skip list consists of $l_{max} + 1$ lists, from the level-0 list to the level-$l_{max}$ list. The level-$l$ list contains all $i \in S$ with $l_i \ge l$. Each element of the level-$l$ list has a level-$l$ pointer pointing to the next element of the level-$l$ list (or 0 if there is no next element). The skip list also uses one additional "start" entry. This entry does not store any $(i, x_i)$ but has $l_{max} + 1$ pointers, with the level-$l$ pointer pointing to the first element of the level-$l$ list. An example is shown in Figure 1.

In our case, each list is in the order of increasing $x_i$. (If several $i$ have the same $x_i$, they are ordered by $i$.) Instead of storing an address for a memory location, pointers store the value of the next element $i \in S$. Given $i$, we can find the entry for $(i, x_i)$ by computing $h(i)$ and searching the $h(i)$-th bucket.

Given $x$, we can search the skip list as follows:

1. Traverse the level-$l_{max}$ list until we find the last element $i_{l_{max}}$ with $x_{i_{l_{max}}} < x$.

2. For each $l = l_{max} - 1, l_{max} - 2, \ldots, 0$, traverse the level-$l$ list, starting at $i_{l+1}$, until the last element $i_l$ with $x_{i_l} < x$.

The result of the last stage is $i_0$, the last element of the level-0 list (which contains all $i \in S$) with $x_{i_0} < x$. If we are given $i$ and $x_i$, a similar search can find the last element $i_0$ which satisfies either $x_{i_0} < x_i$ or $x_{i_0} = x_i$ and $i_0 < i$. This is the element which would precede $i$, if $i$ was inserted into the skip list.
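The two-step search can be sketched classically as below. This is a minimal illustration, not the paper's implementation: the lists are built directly from already known levels, pointers are stored in Python dictionaries rather than inside hash-table entries, and `START` is a hypothetical sentinel for the "start" entry (0 marks "no next element", as in the text).

```python
START = -1  # hypothetical sentinel for the "start" entry
END = 0     # pointer value meaning "no next element"


def build_skip_list(values: dict[int, int], levels: dict[int, int], l_max: int):
    """Return next_ptr, where next_ptr[l][i] is the successor of i in the
    level-l list; lists are ordered by (x_i, i)."""
    order = sorted(values, key=lambda i: (values[i], i))
    next_ptr = []
    for l in range(l_max + 1):
        members = [START] + [i for i in order if levels[i] >= l]
        next_ptr.append(dict(zip(members, members[1:] + [END])))
    return next_ptr


def search(next_ptr, values: dict[int, int], x: int, l_max: int) -> list[int]:
    """Return preds, where preds[l] is the last element of the level-l list
    with value < x (START if there is none)."""
    preds = [START] * (l_max + 1)
    current = START
    for l in range(l_max, -1, -1):
        # Continue on level l from the element found on level l + 1.
        while next_ptr[l][current] != END and values[next_ptr[l][current]] < x:
            current = next_ptr[l][current]
        preds[l] = current
    return preds


if __name__ == "__main__":
    values = {1: 50, 2: 10, 3: 40, 4: 20, 5: 30}
    levels = {1: 0, 2: 1, 3: 2, 4: 0, 5: 1}
    pointers = build_skip_list(values, levels, l_max=2)
    print(search(pointers, values, x=35, l_max=2))
```

The predecessors collected on every level are exactly what the insertion procedure needs in order to splice a new element into each list it belongs to.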

It remains to specify the levels $l_i$. The level $l_i$ is assigned to each $i \in [N]$ before the beginning of the computation and does not change during the computation. $l_i$ is equal to $j$ with probability $1/2^{j+1}$ for $j < l_{max}$ and probability $1/2^{l_{max}}$ for $j = l_{max}$.

The straightforward implementation (in which we choose the level independently for each $i$) has the drawback that we have to store the level for each possible $i \in [N]$, which requires $\Omega(N)$ time to choose the levels and $\Omega(N)$ space to store them. To avoid this problem, we define the levels using $l_{max}$ functions $h_1, h_2, \ldots, h_{l_{max}} : [N] \to \{0, 1\}$. $i \in [N]$ belongs to level $l$ (for $l < l_{max}$) if $h_1(i) = \ldots = h_l(i) = 1$ but $h_{l+1}(i) = 0$. $i \in [N]$ belongs to level $l_{max}$ if $h_1(i) = \ldots = h_{l_{max}}(i) = 1$. Each hash function is picked uniformly at random from a $d$-wise independent family of hash functions, for $d = \lceil 4\log_2 N + 1 \rceil$.

In the quantum case, we augment the quantum state by an extra register holding $|h_1, \ldots, h_{l_{max}}\rangle$. The register is initialized to a superposition in which every basis state $|h_1, \ldots, h_{l_{max}}\rangle$ has an equal amplitude. The register is then used to perform transformations dependent on $h_1, \ldots, h_{l_{max}}$ on other registers.
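The rule that turns the bits $h_1(i), \ldots, h_{l_{max}}(i)$ into a level is simply "count the leading ones". The sketch below is illustrative: it treats the bits as fully independent and uniform (an idealization of the $d$-wise independent family) and verifies, by exhaustive enumeration, that the resulting level distribution is $1/2^{j+1}$ for $j < l_{max}$ and $1/2^{l_{max}}$ for $j = l_{max}$.

```python
from fractions import Fraction
from itertools import product


def level_from_bits(bits: list[int]) -> int:
    """bits[l - 1] plays the role of h_l(i); the level is the number of
    leading ones, which is at most l_max = len(bits)."""
    level = 0
    for bit in bits:
        if bit != 1:
            break
        level += 1
    return level


def level_distribution(l_max: int) -> list[Fraction]:
    """Exact distribution of the level when all l_max bits are uniform and
    independent (an idealization of the d-wise independent family)."""
    counts = [0] * (l_max + 1)
    for bits in product((0, 1), repeat=l_max):
        counts[level_from_bits(list(bits))] += 1
    return [Fraction(c, 2**l_max) for c in counts]


if __name__ == "__main__":
    print(level_distribution(3))
```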

Operations: insertion and deletion. To add $i$ to $S$, we first query the value $x_i$. Then, we compute $h(i)$
and add $(i, x_i)$ to the $h(i)$-th bucket. If the bucket already contains some entries, we may move some of them
so that, after inserting $(i, x_i)$, the entries are still in the order of increasing $i$. We then add 1 to the counter
$d_1$ for the $h(i)$-th bucket and the counter $d_l$ for the $(\lceil h(i)/2^l \rceil 2^l)$-th bucket, for each $l \in \{2, \ldots, \lceil \log r \rceil\}$. We
then update the skip list:

1. Run the search for the last element before $i$ (as described earlier). The search finds the last element $i_l$
before $i$ on each level $l \in \{0, \ldots, l_{\max}\}$.

2. For each level $l \in \{0, \ldots, l_i\}$, let $j_l$ be the level-$l$ pointer of $i_l$. Set the level-$l$ pointer of $i$ to be equal
to $j_l$ and the level-$l$ pointer of $i_l$ to be equal to $i$.

After the update is complete, we use the skip list to find the smallest $j$ such that $x_j = x_i$ and then use
level-0 pointers to determine whether the number of $j$ with $x_j = x_i$ is less than $k$, exactly $k$, or more than $k$. If there are
exactly $k$ such $j$, we increase $v$ by 1. (In this case, before adding $i$ to $S$, there were $k - 1$ such $j$ and, after
adding $i$, there are $k$ such $j$. Thus, the number of $x$ such that $S$ contains $i_1, \ldots, i_k$ with $x_{i_1} = \ldots = x_{i_k} = x$
has increased by 1.)
An element $i$ can be deleted from $S$ by running this procedure in reverse.

Operations: checking for $k$-collisions. To check for $k$-collisions in the set $S$, we just check whether $v > 0$.
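The bookkeeping for the counter $v$ can be sketched in isolation. In this simplified classical stand-in, a dictionary of multiplicities replaces the hash table and skip list, so it illustrates only when $v$ changes, not the data structure itself; the class name `CollisionCounter` and its methods are hypothetical.

```python
from collections import Counter

class CollisionCounter:
    """Tracks v = number of values x that occur at least k times among the stored items."""

    def __init__(self, k):
        self.k = k
        self.count = Counter()   # multiplicity of each value x_i currently stored
        self.v = 0

    def insert(self, x):
        self.count[x] += 1
        if self.count[x] == self.k:   # x has just reached k occurrences
            self.v += 1

    def delete(self, x):
        # Exact reverse of insert: undo the increment of v, then the multiplicity.
        if self.count[x] == self.k:
            self.v -= 1
        self.count[x] -= 1

    def has_k_collision(self):
        return self.v > 0
```

Note that `v` changes only when a multiplicity crosses the threshold $k$, which is why deletion can simply mirror insertion.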

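Operations: diffusion transform. As shown by Grover [26], the following transformation on $|1\rangle, \ldots, |n\rangle$
can be implemented with $O(\log n)$ elementary gates:

$$|i\rangle \to \left(-1 + \frac{2}{n}\right)|i\rangle + \sum_{i' \in [n],\, i' \neq i} \frac{2}{n}\,|i'\rangle. \qquad (13)$$
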
To implement our transformation in step 4 of Algorithm 1, we need to implement a 1-1 mapping $f$
between $S$ and $\{1, \ldots, |S|\}$. Once we have such a mapping, we can carry out the transformation
$|y\rangle \to |f(y)\rangle$ by $|y\rangle|0\rangle \to |y\rangle|f(y)\rangle \to |0\rangle|f(y)\rangle$, where the first step is a calculation of $f(y)$ from $y$ and
the second step is the reverse of a calculation of $y$ from $f(y)$. Then, we perform the transformation (13) on
$|1\rangle, \ldots, ||S|\rangle$ and then apply the transformation $|f(y)\rangle \to |y\rangle$, mapping $\{1, \ldots, |S|\}$ back to $S$.
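As a purely classical illustration, Grover's diffusion transformation acts on a vector of $n$ amplitudes by reflecting each amplitude about the mean. The sketch below simulates this with $O(n)$ arithmetic operations, so it says nothing about the $O(\log n)$ gate count; the function name `diffuse` is an assumption of this example.

```python
def diffuse(amplitudes):
    """Map each amplitude a_i to -a_i + (2/n) * (a_1 + ... + a_n)."""
    n = len(amplitudes)
    mean = sum(amplitudes) / n
    return [2 * mean - a for a in amplitudes]
```

Applied to the basis state `[1.0, 0.0, 0.0, 0.0]` this gives `[-0.5, 0.5, 0.5, 0.5]`, and applying it a second time restores the original vector, as expected for a reflection.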

The mapping $f$ can be defined as follows. $f(y) = f_1(y) + f_2(y)$, where $f_1(y)$ is the number of items
$i \in S$ that are mapped to buckets $j$, $j < h(y)$, and $f_2(y)$ is the number of items $y' \le y$ that are mapped
to bucket $h(y)$. It is easy to see that $f$ is a 1-1 mapping from $S$ to $\{1, \ldots, |S|\}$. $f_2(y)$ can be computed by
counting the number of items in bucket $h(y)$ in time $O(\log N)$. $f_1(y)$ can be computed as follows:

1. Let $i = 0$, $l = \lceil \log r \rceil$, $s = 0$.

2. While $l \ge 0$ repeat:

(a) If $i + 2^l < h(y)$, add $d_l$ from the $(i + 2^l)$-th bucket to $s$; let $i = i + 2^l$;

(b) Let $l = l - 1$;

3. Return $s$ as $f_1(y)$;
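The counting loop can be sketched as follows, under an indexing convention chosen for this example and not necessarily identical to the paper's: buckets are numbered $1, \ldots, r$, and for every $l \ge 0$ and every $j$ divisible by $2^l$, `counters[(l, j)]` holds the number of items in buckets $j - 2^l + 1, \ldots, j$. The helper `build_counters` and the argument `b` (standing for $h(y)$) are hypothetical.

```python
def build_counters(sizes):
    """sizes[j - 1] is the number of items in bucket j, for buckets 1..r."""
    r = len(sizes)
    top = (r - 1).bit_length()           # equals ceil(log2(r)) for r >= 1
    prefix = [0]
    for c in sizes:
        prefix.append(prefix[-1] + c)
    counters = {}
    for l in range(top + 1):
        block = 2 ** l
        for j in range(block, r + 1, block):
            counters[(l, j)] = prefix[j] - prefix[j - block]
    return counters, top

def f1(counters, top, b):
    """Number of items in buckets 1..b-1, using at most top + 1 counter reads."""
    i, s = 0, 0
    for l in range(top, -1, -1):
        if i + 2 ** l < b:               # the whole block i+1 .. i+2^l lies before bucket b
            s += counters[(l, i + 2 ** l)]
            i += 2 ** l
    return s
```

The loop reads at most $\lceil \log r \rceil + 1$ counters, since it consumes the binary expansion of $b - 1$ from the most significant bit down.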

The transformation in step 1 of Algorithm 1 is implemented using a similar 1-1 mapping $f$
between $[N] \setminus S$ and $\{1, \ldots, N - |S|\}$.

Uniqueness. It is easy to see that a set $S$ is always stored in the same way. The values $i \in S$ are always
hashed to buckets by $h$ in the same way and, in each bucket, the entries are located in the order of increasing
$i$. The counters counting the number of entries in the buckets are uniquely determined by $S$. The structure
of the skip list is also uniquely determined, once the functions $h_1, \ldots, h_{l_{\max}}$ are fixed.

Guaranteed running time. We show that, for any $S$, the probability that lookup, insertion or deletion
of some element takes more than $O(\log^4(N + M))$ steps is very small. We then modify the algorithms
for lookup, insertion or deletion so that they abort after $c \log^4(N + M)$ steps and show that this has no
significant effect on the entire quantum search algorithm. More precisely, let

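$$|\psi_t\rangle = \sum_{S, y, h_1, \ldots, h_{l_{\max}}} \alpha^t_{S,y}\, |\psi_{S, h_1, \ldots, h_{l_{\max}}}\rangle |y\rangle$$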

be the state of the quantum algorithm after $t$ steps (each step being the quantum translation of one data
structure operation), using quantum translations of the perfect data structure operations (which do not fail
but may take more than $c \log^4 N$ steps). Here, $|\psi_{S, h_1, \ldots, h_{l_{\max}}}\rangle$ stands for the basis state corresponding to our
data structure storing $S$ and $x_i$, $i \in S$, using the hash functions $h_1, \ldots, h_{l_{\max}}$. (Notice that the amplitude
$\alpha^t_{S,y}$ is independent of $h_1, \ldots, h_{l_{\max}}$, since all $h_1, \ldots, h_{l_{\max}}$ are equally likely.)

We decompose $|\psi_t\rangle = |\psi_t^{good}\rangle + |\psi_t^{bad}\rangle$, with $|\psi_t^{good}\rangle$ consisting of $(S, h_1, \ldots, h_{l_{\max}})$ for which the
next operation successfully completes in $c \log^4(N + M)$ steps and $|\psi_t^{bad}\rangle$ consisting of $(S, h_1, \ldots, h_{l_{\max}})$
for which the next operation fails to complete in $c \log^4(N + M)$ steps. Let $|\psi'_t\rangle$ be the state of the quantum
algorithm after $t$ steps using the imperfect data structure algorithms which may abort. The next lemma is an
adaptation of "hybrid argument" by Bennett et al. fTP\ to our context. 

Lemma 5

$$\|\psi_t - \psi'_t\| \le \sum_{t'=1}^{t} 2\,\|\psi_{t'-1}^{bad}\|.$$

Proof: By induction. It suffices to show that

$$\|\psi_t - \psi'_t\| \le \|\psi_{t-1} - \psi'_{t-1}\| + 2\,\|\psi_{t-1}^{bad}\|.$$

To show that, we introduce an intermediate state $|\psi''_t\rangle$ which is obtained by applying the perfect
transformations in the first $t - 1$ steps and the transformation which may fail in the last step. Then,

$$\|\psi_t - \psi'_t\| \le \|\psi_t - \psi''_t\| + \|\psi''_t - \psi'_t\|.$$

The second term, $\|\psi''_t - \psi'_t\|$, is the same as $\|\psi_{t-1} - \psi'_{t-1}\|$ because the states $|\psi''_t\rangle$ and $|\psi'_t\rangle$ are obtained
by applying the same unitary transformation (quantum translation of a data structure transformation which
may fail) to the states $|\psi_{t-1}\rangle$ and $|\psi'_{t-1}\rangle$, respectively. To bound the first term, $\|\psi_t - \psi''_t\|$, let $U_p$ and $U_i$ be the
unitary transformations corresponding to the perfect and imperfect versions of the $t$-th data structure operation.
Then, $|\psi_t\rangle = U_p|\psi_{t-1}\rangle$ and $|\psi''_t\rangle = U_i|\psi_{t-1}\rangle$. Since $U_p$ and $U_i$ only differ for $(S, h_1, \ldots, h_{l_{\max}})$ for which
the data structure operation does not finish in $c \log^4 N$ steps, we have

$$\|\psi_t - \psi''_t\| = \|U_p|\psi_{t-1}\rangle - U_i|\psi_{t-1}\rangle\| = \|U_p|\psi_{t-1}^{bad}\rangle - U_i|\psi_{t-1}^{bad}\rangle\| \le 2\,\|\psi_{t-1}^{bad}\|.$$

∎

Lemma 6 For every t, = 0{j^). 

Proof: We assume that there is exactly one $k$-collision $x_{i_1} = \ldots = x_{i_k}$. (If there are no $k$-collisions, the
checking step at the end of Algorithm 2 ensures that the answer is correct. The case with more than one
$k$-collision reduces to the case with exactly one $k$-collision because of the analysis in section 5.)

By Lemma n every basis state \S, x) of the same type has equal amplitude. Also, all hi, ... , hi^^^ 
have equal probabilities. Therefore, it suffices to show that, for any fixed $s = |S \cap \{i_1, \ldots, i_k\}|$ and
$t = |\{x\} \cap \{i_1, \ldots, i_k\}|$, the fraction of $|S, x, h_1, \ldots, h_{l_{\max}}\rangle$ for which the operation fails is at most

There are two parts of the update operation which can fail:

1. The hash table can overflow if more than $\lceil \log N \rceil$ elements $i \in S$ have the same $h(i) = j$;

2. Update or lookup in the skip list can take more than $c \log^4 N$ steps.
For the first part, let $s = |S \cap \{i_1, \ldots, i_k\}|$. If more than $\lceil \log N \rceil$ elements $i \in S$ have $h(i) = j$,
then at least $\lceil \log N \rceil - s$ of them must belong to $[N] \setminus \{i_1, \ldots, i_k\}$. We now show that, for a random set
$S' \subseteq [N] \setminus \{i_1, \ldots, i_k\}$, $|S'| = r - s$, the probability that more than $\lceil \log N \rceil - s$ of $i \in S'$ satisfy $h(i) = j$ is
small.

We introduce random variables $X_1, \ldots, X_{r-s}$ with $X_l = 1$ if $h$ maps the $l$-th element of $S'$ to $j$. We
need to bound X = Xi + . . . + Xr^s- We have < E[Xi] < which means that E[Xi] = 

^ + 0{j^). (Here, we are assuming that k is a constant, s is also a constant because s < k.) Therefore, 
$E[X] = (r - s)E[X_l] = 1 + o(1)$.

The random variables $X_l$ are negatively correlated: if one or more of the $X_l$ is equal to 1, then the probability
that other variables $X_{l'}$ are equal to 1 decreases. Therefore [34], we can apply Chernoff bounds to bound
$\Pr[X > \log N - s]$. By using the bound $\Pr[X > (1 + \delta)E[X]] < \left(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right)^{E[X]}$, we get

logAf-s-l / 1 \ 

Prix > log AT - si < = o — . 

^ ^ ^ (logiV- s)i°s^-^ \mj 

For the second part, we consider the time required for insertion of a new element. (Removing an element
requires the same time, because it is done by running the insertion algorithm in reverse.) Adding $(i, x_i)$ to
the $h(i)$-th bucket requires comparing $i$ to entries already in the bucket and, possibly, moving some of the
entries so that they remain sorted in the order of increasing $i$. Since a bucket contains $O(\log N)$ entries and
each entry uses $O(\log(N + M))$ bits, this can be done in $O(\log^2(N + M))$ time. Updating counters $d_l$ requires
$O(\log N)$ time, for each of $O(\log r) = O(\log N)$ counters.

To update the skip list, we first need to compute $h_1(i), \ldots, h_{l_{\max}}(i)$. This is the most time-consuming
step, requiring $O(d \log^2 N) = O(\log^3 N)$ steps for each of $l_{\max} = \lceil \log N \rceil$ functions $h_l$. The total time
for this step is $O(\log^4 N)$. We then need to update the pointers in the skip list. We show that, for any fixed
$S$, $y$ (and random $h_1, \ldots, h_{l_{\max}}$), the probability that updating the pointers in the skip list takes more than
$c \log^4 N$ steps is small.

Each time we access a pointer in the skip list, it may take $O(\log^2 N)$ steps, because a pointer
stores the number $i$ of the next entry and, to find the entry $(i, x_i)$ itself, we have to compute $h(i)$ and search
the $h(i)$-th bucket, which may contain $\log N$ entries, each of which uses $\log N$ bits to store $i$. Therefore, it
suffices to show that the probability of a skip list operation accessing more than $c \log^2 N$ pointers is small.

We do that by proving that at most $d = 4 \log N + 1$ pointer accesses are needed on each of $\log N + 1$
levels $l$. We first consider level 0. Let $j_1, j_2, \ldots$ be the elements of $S$ ordered so that $x_{j_1} \le x_{j_2} \le x_{j_3} \ldots$
(and, if $x_{j_i} = x_{j_{i+1}}$ for some $i$, then $j_i < j_{i+1}$). If the algorithm requires more than $d$ pointer accesses
on level 0, it must be the case that, for some $i'$, $j_{i'}, \ldots, j_{i'+d-1}$ are all at level 0. That is equivalent to
$h_1(j_{i'}) = h_1(j_{i'+1}) = \ldots = h_1(j_{i'+d-1}) = 0$. Since $h_1$ is $d$-wise independent, the probability that
$h_1(j_{i'}) = \ldots = h_1(j_{i'+d-1}) = 0$ is $2^{-d} < N^{-4}$.
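The arithmetic behind the last inequality can be written out as a short check, assuming base-2 logarithms and $d \ge 4 \log_2 N + 1$, as in the definition of $d$ given for the hash functions:

```latex
2^{-d} \;\le\; 2^{-(4\log_2 N + 1)} \;=\; \tfrac{1}{2}\left(2^{\log_2 N}\right)^{-4} \;=\; \tfrac{1}{2}\,N^{-4} \;<\; N^{-4}.
```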

For level $l$ ($0 < l < l_{\max}$), we first fix the hash functions $h_1, \ldots, h_l$. Let $j_1, j_2, \ldots$ be the elements
of $S$ for which $h_1, \ldots, h_l$ are all 1, ordered so that $x_{j_1} \le x_{j_2} \le x_{j_3} \ldots$. By the same argument, the
probability that the algorithm needs $d$ or more pointer accesses on level $l$ is the same as the probability that
$h_{l+1}(j_{i'}) = \ldots = h_{l+1}(j_{i'+d-1}) = 0$ for some $i'$, and this probability is at most $2^{-d} < N^{-4}$. For level
$l_{\max}$, we fix hash functions $h_1, \ldots, h_{l_{\max}-1}$ and notice that $i$ is on level $l_{\max}$ whenever $h_{l_{\max}}(i) = 1$. The
rest of the argument is as before, with $h_{l_{\max}}(j_{i'}) = h_{l_{\max}}(j_{i'+1}) = \ldots = h_{l_{\max}}(j_{i'+d-1}) = 1$ instead of
$h_1(j_{i'}) = h_1(j_{i'+1}) = \ldots = h_1(j_{i'+d-1}) = 0$.
Since there are $\log N + 1$ levels and $r$ elements of $S$, the probability that the algorithm spends more than
k — 1 steps on one level for some element of S is at most O(^^l^) = O(^). 

Therefore, 1 1 V't"'^ f = O ( ^ ) and 1 1 V^^^ 1 1 = O ( ^ ) , proving the lemma. | 

By Lemmas 5 and 6, the distance between the final states of the ideal algorithm (where the data structures
never fail) and the actual algorithm is of order O(-^y^) = 0{jj^). This also means that the probability 
distributions obtained by measuring the two states differ by at most 0(-^^^), in variational distance iT3l . 
Therefore, the imperfectness of the data structure operations does not have a significant effect. 

Implementation in comparison model. The implementation in the comparison model is similar, except
that the hash table only stores $i$ instead of $(i, x_i)$.

7 Open problems 

1. Time-space tradeoffs. Our optimal $O(N^{2/3})$-query algorithm requires space to store $O(N^{2/3})$ items.
How many queries do we need if the algorithm's memory is restricted to $r$ items? Our algorithm needs
$O(N/\sqrt{r})$ queries and this is the best known. Curiously, the lower bound for deterministic algorithms in
the comparison query model is $\Omega(N^2/r)$ queries [38], which is quadratically more. This suggests that our
algorithm might be optimal in this setting as well. However, the only lower bound is the $\Omega(N^{2/3})$
lower bound for algorithms with unrestricted memory Q- 

2. Optimality of $k$-distinctness algorithm. While element distinctness is known to require $\Omega(N^{2/3})$
queries, it is open whether our $O(N^{k/(k+1)})$-query algorithm for $k$-distinctness is optimal.

The best lower bound for $k$-distinctness is $\Omega(N^{2/3})$, by the following argument. We take an instance of
element distinctness $x_1, \ldots, x_N$ and transform it into $k$-distinctness by repeating every element $k - 1$
times. If $x_1, \ldots, x_N$ are all distinct, there are no $k$ equal elements. If there are $i, j$ such that $x_i = x_j$
among the original $N$ elements, then repeating each of them $k - 1$ times creates $2k - 2$ equal elements.
Therefore, solving $k$-distinctness on $(k - 1)N$ elements requires at least the same number of queries
as solving distinctness on $N$ elements (which requires $\Omega(N^{2/3})$ queries).

3. Quantum walks on other graphs. A quantum walk search algorithm based on similar ideas can 
be used for Grover search on grids |l8l|22l. What other graphs can quantum- walks based algorithms 
search? Is there a graph-theoretic property that determines if quantum walk algorithms work well on 
this graph? 

m and ll37l have shown that, for a class of graphs, the performance of quantum walk depends on 
certain expressions consisting of graph's eigenvalues. In particular, if a graph has a large eigenvalue 
gap, quantum walk search performs well |37|. A large eigenvalue gap is, however, not necessary, as 
shown by quantum search algorithms for grids I8. .37,l . 
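The reduction from element distinctness to $k$-distinctness used in open problem 2 is simple enough to check classically. The sketch below is only an illustration of that argument; the function names are hypothetical and it assumes $k \ge 2$.

```python
from collections import Counter

def to_k_distinctness(xs, k):
    """Repeat every element k - 1 times (assumes k >= 2)."""
    return [x for x in xs for _ in range(k - 1)]

def has_k_equal(xs, k):
    """True if some value occurs at least k times in xs."""
    return any(c >= k for c in Counter(xs).values())

def all_distinct(xs):
    return len(set(xs)) == len(xs)
```

For any `xs` and any `k >= 2`, `has_k_equal(to_k_distinctness(xs, k), k)` holds exactly when `xs` is not all distinct: distinct inputs yield multiplicity $k - 1$, while a repeated value yields multiplicity at least $2k - 2 \ge k$.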